
OpenAI’s chatGPT has awakened a collective awareness of what Large
Language Models (LLMs) are capable of. With that awakening comes a daily
march of LLM news: new products, new features, new models, new
capabilities, (and new worries). It seems we’re in the early stages of a
Cambrian explosion of LLMs and LLM-powered tools; it’s not yet clear how
LLMs will impact and influence our professional and personal lives, but
it seems clear that they will, in some way.
Since LLMs are here to stay, it’s worthwhile to take some time to
understand how these models work from a first-principles perspective.
Starting with the mechanics can help foster durable intuitions that will
inform our usage of these models now and in the future. (Especially if
the future is one where LLMs are a staple of the data scientist’s
toolbox, as common as an lm() function call).
And what better way is there to learn than by doing. So with that
preamble, in this post we’ll walk through an implementation of an LLM,
LLaMA (Touvron et al. 2023)
specifically, in TensorFlow and Keras, with the goal being to develop
understanding first, capability second.
Why LLaMA? With the sheer volume of LLM-related content and news out
there, it can seem daunting to know where to get started. Almost weekly
it seems there’s a new model announced. Browsing some hubs of LLM
activity (HuggingFace,
TFHub,
reddit,
HackerNews) muddies the waters even
more. How to pick a specific model?
Of the many LLM-related news items in the past months, one that stands
head-and-shoulders above the crowd is the release of
LLaMA,
a modern, foundational LLM made available to the public by Meta AI in
February 2023. On common benchmarks, LLaMA outperforms OpenAI’s GPT-3,
while being significantly smaller (though still large).
LLaMA is a great starting place because it is a simple and modern
architecture, has excellent performance on benchmarks, and is open. The
model architecture has had just a few new ideas incorporated into it since
the original Transformer architecture first described in
“Attention Is All You Need”
published from Google (Vaswani et al. 2017). Four distinct sizes of
LLaMA were released: 7 billion and 13 billion parameter models
trained on 1 trillion tokens, and 33 billion and 65 billion parameter
models trained on 1.4 trillion tokens. This is an enormous amount of
training data these models have seen: the largest 65B model has been
trained on approximately the “Chinchilla
compute-optimum” (Hoffmann et al. 2022)
number of tokens, while the smaller LLaMAs are substantially
beyond that optimum. In this blog post we’ll focus on the smallest, 7B
parameter LLaMA model, which you can comfortably load locally and run on
CPU with only 64Gb of RAM.
While not strictly necessary, to follow along locally, you’ll probably
want to acquire the pre-trained LLaMA weights one
way or
another. Note, the
weights do come with their own license, which you can preview
here.
So, without further ado, let’s get started.
Setup
First, we’ll want to install the required R and Python packages, and
configure a virtual environment:
remotes::install_github(c("rstudio/reticulate",
"rstudio/tensorflow",
"rstudio/keras"))
reticulate::virtualenv_create("./.venv", version = "3.10")
tensorflow::install_tensorflow(envname = "./.venv", version = "release")
With that out of the way, let’s load some packages and prepare our R
session:
library(purrr)
library(envir)
library(tensorflow)
library(tfautograph)
library(keras)
use_virtualenv("./.venv")
options(tensorflow.extract.warn_tensors_passed_asis = FALSE)
attach_eval({
import_from(glue, glue)
import_from(jsonlite, read_json)
import_from(withr, with_dir, with_options)
import_from(keras$layers, Dense)
np <- reticulate::import("numpy", convert = FALSE)
seq_len0 <- function(x) seq.int(from = 0L, length.out = x)
})
If you’ve acquired the pre-trained weights, it’ll be convenient to
convert them from the torch checkpoint format to something that’s more
framework-agnostic (you only need to do this once, of course):
# reticulate::py_install("torch", pip = TRUE)
torch <- reticulate::import("torch", convert = FALSE)
with_dir("~/github/facebookresearch/llama/weights/LLaMA/7B", {
  pretrained_weights <- torch$load("consolidated.00.pth",
                                   map_location = "cpu")
  for (name in names(pretrained_weights)) {
    filename <- sprintf("%s.npy", name)
    array <- pretrained_weights[[name]]$numpy()
    np$save(filename, array)
    message(glue(
      "wrote: '{basename(filename)}' with shape: {array$shape}"))
  }
})
We’ll also define a helper function so we can avoid having to retype the
full path to our weights:
weights_path <- function(filename) normalizePath(file.path(
  "~/github/facebookresearch/llama/weights/LLaMA/",
  glue(filename, .envir = parent.frame())), mustWork = TRUE)
And load the model configuration parameters specific to the 7B LLaMA,
which we’ll use to build the model.
params <- read_json(weights_path("7B/params.json"))
str(params)
List of 6
$ dim : int 4096
$ multiple_of: int 256
$ n_heads : int 32
$ n_layers : int 32
$ norm_eps : num 1e-06
$ vocab_size : int -1
Tokenizer
The first component to LLaMA is the tokenizer, which converts text to a
sequence of integers. The LLaMA model uses the
SentencePiece tokenizer from
Google. SentencePiece is available as a TensorFlow graph operation
through
tf_text.SentencepieceTokenizer,
and also as a Keras layer in
keras_nlp.tokenizers.SentencepieceTokenizer.
By choice of a coin flip, we’ll use the lower-level tf_text interface.
tf_text <- reticulate::import("tensorflow_text")
tokenizer_path <- weights_path("tokenizer.model")
tokenizer <- tf_text$SentencepieceTokenizer(
  tf$io$gfile$GFile(tokenizer_path, "rb")$read(),
  add_bos = TRUE, add_eos = FALSE,
)
Let’s try it out with a prompt:
prompt <- "The best way to attract bees"
tokenizer$tokenize(prompt)
tf.Tensor([    1   450  1900   982   304 13978   367   267], shape=(8), dtype=int32)
prompt |> tokenizer$tokenize() |> tokenizer$detokenize()
tf.Tensor(b'The best way to attract bees', shape=(), dtype=string)
Let’s define a show_tokens() helper function and play with the
tokenizer a little.
show_tokens <- function(what) {
  if (is.character(what))
    token_ids <- what |> tokenizer$tokenize() |> as.integer()
  else
    token_ids <- as.integer(what)
  tokens <- map_chr(token_ids, \(id) as.character(tokenizer$id_to_string(id)))
  names(tokens) <- token_ids
  tokens
}
show_tokens(prompt)
    1   450   1900   982   304    13978   367   267 
   ""  "The" "best" "way"  "to" "attract"  "be"  "es" 
Note that “bees” is two tokens. Not every token corresponds to a word.
For example, one non-word token we can reliably expect to show up in a
tokenizer trained on a corpus of English text is “ing.” However, when the
“ing” token shows up will not always follow your intuitions, because
common words get their own token id, even if they can be decomposed into
multiple tokens.
show_tokens("ing")
    1  2348 
   "" "ing" 
show_tokens("working")
    1      1985 
   "" "working" 
show_tokens("flexing")
    1   8525   292 
   "" "flex" "ing" 
show_tokens("wonking")
    1   2113   9292 
   ""  "won" "king" 
Another thing to note about the tokenizer is that each token sequence
starts with token id 1. This is a special beginning-of-sequence
token that we requested be added when we loaded the tokenizer with
add_bos = TRUE. There are two other such special tokens that we will
encounter later: an end-of-sequence special token with id 2, and an
unknown-token with id 0.
as.character(tokenizer$id_to_string(0L))
[1] "<unk>"
as.character(tokenizer$id_to_string(1L))
[1] "<s>"
as.character(tokenizer$id_to_string(2L))
[1] "</s>"
show_tokens(c(1, 0, 2))
    1     0     2 
   "" " ⁇ "    "" 
In total, there are 32,000 tokens.
as.integer(tokenizer$vocab_size())
[1] 32000
One last observation is that the more frequently encountered tokens are
assigned lower ids.
show_tokens(seq(50, len = 10))
50 51 52 53 54 55 56 57 58 59
"/" "0" "1" "2" "3" "4" "5" "6" "7" "8"
show_tokens(seq(100, len = 10))
100 101 102 103 104 105 106 107 108 109
"a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
show_tokens(seq(1000, len = 10))
1000 1001 1002 1003 1004 1005 1006 1007 1008 1009
"ied" "ER" "stat" "fig" "me" "von" "inter" "roid" "ater" "their"
show_tokens(seq(10000, len = 10))
 10000   10001 10002  10003    10004  10005  10006    10007 
 "ång" "citep" "Ill" "rank" "sender" "beim"  "рак" "compat" 
   10008   10009 
"occurs" "diese" 
show_tokens(seq(20000, len = 10))
  20000     20001 20002  20003 20004    20005 20006  20007 
"admit" "Comment" "стя" "Vien"  "ці" "permut" "cgi" "crít" 
    20008  20009 
"Console" "ctic" 
show_tokens(seq(to = as.integer(tokenizer$vocab_size()) - 1, len = 10))
31990 31991 31992 31993 31994 31995 31996 31997 31998 31999
"ὀ" "げ" "べ" "边" "还" "黃" "왕" "收" "弘" "给"
Moving on, the next step after tokenization is embedding. An embedding
layer is effectively a dictionary lookup that converts an integer (token
id) to a 1-d float array. For this we can use the standard keras
Embedding layer.
tok_embeddings <- keras$layers$Embedding(
  input_dim = tokenizer$vocab_size(),
  output_dim = params$dim,
  embeddings_initializer =
    \(...) np$load(weights_path("7B/tok_embeddings.weight.npy"))
)
tok_embeddings(3L) |> str()
<tf.Tensor: shape=(4096), dtype=float32, numpy=…>
prompt |> # "The best way to attract bees"
  tokenizer$tokenize() |>
  tok_embeddings() |>
  str()
<tf.Tensor: shape=(8, 4096), dtype=float32, numpy=…>
TransformerBlock
Once it’s tokenized and embedded, the input then passes through the bulk
of the model, a sequence of repeating TransformerBlock layers. The 7B
model has 32 of these TransformerBlock layers, while the 65B model has
80 of them.
weights_path("7B/params.json") |> read_json() |> _$n_layers
[1] 32
weights_path("65B/params.json") |> read_json() |> _$n_layers
[1] 80
Here is what the transformer block looks like:
TransformerBlock(keras$layers$Layer) %py_class% {
  initialize <- function(attn_head_size, attn_n_heads,
                         norm_eps = k_epsilon(), ...,
                         block_id = NULL) {
    super$initialize(...)

    self$attention <- Attention(attn_head_size, attn_n_heads,
                                block_id = block_id)

    self$feed_forward <- FeedForward(
      hidden_dim = 4 * attn_head_size * attn_n_heads,
      block_id = block_id)

    self$attention_norm <- RMSNorm(eps = norm_eps,
                                   block_id = block_id,
                                   feeds_into = "attention")
    self$feed_forward_norm <- RMSNorm(eps = norm_eps,
                                      block_id = block_id,
                                      feeds_into = "ffn")
  }
  call <- function(x) {

    # norm, then attention, then add residual
    x2 <- x |>
      self$attention_norm() |>
      self$attention()
    x <- x + x2

    # norm, then feed-forward, then add residual
    x2 <- x |>
      self$feed_forward_norm() |>
      self$feed_forward()
    x <- x + x2

    x
  }
}
While there is not a lot of code, there are a lot of ideas packed in
there. This block forms the main trunk of the model, so it’s worth
taking the time to go through it slowly.
We implement the TransformerBlock as a subclassed
keras.layers.Layer. This gives us some niceties like the ability to
compose with other Keras layers, but these are mostly irrelevant to the
purpose of this blog post; we could just as easily implement this as,
for example, a vanilla R6 class. Our TransformerBlock class has two
methods: initialize, called when we first create the block, and
call, called when we run the forward pass of the block.
In initialize, we create 4 layers: an Attention layer, a
FeedForward layer, and 2 RMSNorm layers. We’ll take a close look at
each of these soon, but even before we do so, we can see how they fit
together by looking at the TransformerBlock$call() method.
The call method has a few simple ideas. In no particular order, the
first one to observe is the composition pattern of adding residuals.
x2 <- x |> ...
x <- x + x2 # add residual x to x2
This is a common pattern that helps with model training, and especially
to help with the vanishing gradient
problem. It’s
a skip-connection in the otherwise linear sequence of matrix
transformations. It reinjects information (during the forward pass), and
gradients (during backpropagation), back into the trunk. You can think
of these residual connections as freeing the learnable layers in-between
(the ... in the pseudo code) from the burden of having to
“pass-through” or “preserve” information in x, allowing the weights to
instead focus on learning transformations that are, (in corporatese
vernacular), value-adding.
The next composition pattern to note is the repeating usage of a
normalization layer:
x2 <- x |> norm() |> ...
x <- x + x2
There are many kinds of normalization layers, but to slightly
over-generalize, they can all be thought of as a stabilizer that helps
with training. Like their deep-learning cousins the regularizers, their
main function is to keep values passing through in a sensible range, in
the ball park of (-1, 1), typically. We’ll take a closer look at
RMSNorm soon.
Stripped of two tricks that are mostly there to help the model train,
residuals and normalization, the core of the TransformerBlock is just
this:
x |> attention() |> feed_forward()
In a moment we’ll see that feed_forward is a slightly fancier
variation of a conventional sequence of Dense layers. Before we get
there we can safely skip ahead to distill the following intuition: a
TransformerBlock is basically an Attention layer followed by a few
(fancy) dense layers, with some simple composition patterns (tricks)
that help with training. Attention is the heart of the model: it’s the
most interesting, and also the most involved.
With the framing in place, let’s go through and take a closer look at
RMSNorm, FeedForward, and then with the foundation in place, we’ll
turn our attention to Attention.
RMSNorm
RMSNorm(keras$layers$Layer) %py_class% {
  initialize <-
    function(eps = 1e-6, ..., block_id = NULL, feeds_into = NULL) {
      super$initialize(...)
      self$eps <- eps
      self$block_id <- block_id
      self$feeds_into <- feeds_into
    }

  build <- function(input_shape) {
    # input_shape == (batch_size, seqlen, params$dim)
    # self$w will broadcast over batch_size and seqlen dims.
    # w_shape == (1, 1, params$dim)
    w_shape <- rep(1L, length(input_shape))
    w_shape[length(input_shape)] <- as.integer(input_shape) |> tail(1L)

    # define a local function that will load
    # the pretrained weights if we supplied `block_id` and `feeds_into`
    import_from({self}, block_id, feeds_into)
    initializer <- if (is.null(block_id))
      "ones"
    else if (block_id >= 0) {
      \(...) weights_path("7B/layers.{block_id}.{feeds_into}_norm.weight.npy") |>
        np$load() |> np$expand_dims(0:1)
    } else if (block_id == -1)
      # load weights for the final output normalization layer, which is not
      # part of a TransformerBlock
      \(...) weights_path("7B/norm.weight.npy") |>
        np$load() |> np$expand_dims(0:1)

    self$w <- self$add_weight(shape = w_shape,
                              initializer = initializer,
                              trainable = TRUE)
  }

  rrms <- function(x) {
    # reciprocal root mean square along the last axis
    x %>%                                              # (batch_size, seqlen, n_features)
      tf$math$square() %>%
      tf$reduce_mean(axis = -1L, keepdims = TRUE) %>%  # (batch_size, seqlen, 1)
      tf$math$add(self$eps) %>%                        # for numerical stability
      tf$math$rsqrt()
  }

  call <- function(x) {
    x * self$rrms(x) * self$w
  }
}
RMSNorm() has a single trainable tensor w. In the forward pass, each
value in the input is multiplied by the reciprocal-root-mean-square of
all the values in the feature axis and by w. Certainly a mouthful, but
just a simple sequence of arithmetic transformations in the end,
designed for the express purpose of adjusting the range of values
passing through.
Let’s kick the tires on it:
norm <- RMSNorm()
m <- matrix(c(0, 1,
2, 3), nrow = 2)
norm(m)
tf.Tensor(
[[0.         1.4142132 ]
 [0.44721353 1.3416406 ]], shape=(2, 2), dtype=float32)
norm(m * 10)
tf.Tensor(
[[0.         1.4142137 ]
 [0.44721362 1.3416408 ]], shape=(2, 2), dtype=float32)
norm(m * 100)
tf.Tensor(
[[0.         1.4142137]
 [0.4472136  1.3416408]], shape=(2, 2), dtype=float32)
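Notice that scaling the input by 10 or 100 barely changes the output: with w
initialized to ones, RMSNorm makes the result (nearly) invariant to the scale
of its input, with the tiny differences coming from the eps term. To confirm
our reading of the forward pass, here is the same arithmetic for the second
row of m done directly in base R (just a sanity check on our understanding,
not something the model needs):
x <- c(1, 3)                # second row of m
x / sqrt(mean(x^2) + 1e-6)  # approximately 0.447 and 1.342, matching above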
FeedForward
Next up is FeedForward()
FeedForward(keras$layers$Layer) %py_class% {
  initialize <- function(hidden_dim, multiple_of = 256L,
                         ..., block_id = NULL) {
    super$initialize()

    if (!is.null(multiple_of)) {
      hidden_dim <- hidden_dim %>%
        { as.integer( . * (2/3)) } %>%
        { (. + multiple_of - 1) %/% multiple_of } %>%
        { . * multiple_of }
    }

    self$hidden_dim <- hidden_dim
    self$block_id <- block_id
  }

  build <- function(input_shape) {
    output_dim <- input_shape |> as.integer() |> tail(1)

    if (is.null(self$block_id))
      load_weight <- \(...) NULL
    else
      load_weight <- \(name) \(...) np$load(weights_path(
        "7B/layers.{self$block_id}.feed_forward.{name}.weight.npy"))$`T`

    self$w1 <- Dense(self$hidden_dim, use_bias = FALSE,
                     kernel_initializer = load_weight("w1"))
    self$w2 <- Dense(output_dim, use_bias = FALSE,
                     kernel_initializer = load_weight("w2"))
    self$w3 <- Dense(self$hidden_dim, use_bias = FALSE,
                     kernel_initializer = load_weight("w3"))

    super$build(input_shape)
  }

  call <- function(x) {
    import_from({self}, w1, w2, w3)
    import_from(tf$nn, silu)

    x %>%
      { silu(w1(.)) * w3(.) } %>% # SwiGLU
      w2()
  }
}
FeedForward consists of three Dense layers. initialize does some
simple arithmetic, munging on the input value hidden_dim to ensure the
dimension is a performant multiple of 256, and build is mostly boilerplate
for creating the layers and loading the weights.
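To make that munging concrete, here is the arithmetic worked out for the 7B
model (a quick sketch: the TransformerBlock above passes in
hidden_dim = 4 * attn_head_size * attn_n_heads = 4 * 128 * 32):
hidden_dim <- 4L * 128L * 32L                            # 16384, as passed in
hidden_dim <- as.integer(hidden_dim * (2/3))             # 10922
hidden_dim <- (hidden_dim + 256L - 1L) %/% 256L * 256L   # rounded up to 11008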
The novelty of FeedForward() is in the call() method, where rather
than composing the Dense layers in a conventional sequential model
with, say, ReLU activations in between and maybe some dropout, the
layers are composed to form a “SwiGLU” unit. The publication by Shazeer (2020)
of SwiGLU and other variations on GLU is an exemplar of the kinds
of explorations and improvements around the Transformer architecture
since its initial publication in
2017; a steady accretion of
improvements that has brought us to today. The FeedForward$call() is
just a single SwiGLU followed by a linear projection. In its essence,
it’s a clever composition of three (learned) linear projections, an
element-wise multiplication, and a silu()
activation
function.
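For reference, silu() (also known as “swish”) is itself a small thing; written
out in plain R it would be just:
silu <- function(x) x * (1 / (1 + exp(-x)))  # x * sigmoid(x)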
Perhaps the most surprising observation to make here is the relative
dearth of activation functions, or even non-linearities, not just in
FeedForward, but overall. The silu() in this feedforward, the
reciprocal-root-mean-square in RMSnorm(), and a softmax() in
Attention() are the only non-linear transformations in the whole
sequence of TransformerBlocks. Everything else is a linear
transformation!
Attention
Finally, let’s turn our attention to Attention().
Attention(keras$layers$Layer) %py_class% {
  initialize <- function(head_size, n_heads,
                         ..., block_id = NULL) {
    super$initialize(...)

    self$head_size <- head_size
    self$n_heads <- n_heads

    if (is.null(block_id))
      load_weight <- function(name) NULL
    else
      load_weight <- \(name) \(...) np$load(weights_path(
        "7B/layers.{block_id}.attention.{name}.weight.npy"))$`T`

    Dense <- function(name) keras$layers$Dense(
      units = n_heads * head_size,
      use_bias = FALSE,
      kernel_initializer = load_weight(name)
    )

    self$wq <- Dense("wq")
    self$wk <- Dense("wk")
    self$wv <- Dense("wv")
    self$wo <- Dense("wo")
  }

  call <- function(x) {
    c(batch_size, seqlen, n_features) %<-% tf$unstack(tf$shape(x))

    # 1. project (linear transform) x into
    #    query, key, and value tensors
    # 2. reshape q k v, splitting out the last dim (n_features)
    #    into n_heads independent subspaces,
    #    each with size head_size.
    #    (n_features == head_size * n_heads)
    split_heads_shape <- c(batch_size, seqlen,
                           self$n_heads, self$head_size)
    q <- x |> self$wq() |> tf$reshape(split_heads_shape)
    k <- x |> self$wk() |> tf$reshape(split_heads_shape)
    v <- x |> self$wv() |> tf$reshape(split_heads_shape)

    # embed positional information in query and key
    # (bsz, seqlen, n_heads, head_size)
    q %<>% apply_rotary_embedding()
    k %<>% apply_rotary_embedding()

    # reshape:
    #   move heads out of the last 2 axes,
    #   so later matmuls are performed across the subspaces (heads)
    #   between (seqlen, head_size) axes
    v <- tf$transpose(v, c(0L, 2L, 1L, 3L)) # (bsz, n_heads, seqlen, head_size)
    q <- tf$transpose(q, c(0L, 2L, 1L, 3L)) # (bsz, n_heads, seqlen, head_size)
    k <- tf$transpose(k, c(0L, 2L, 3L, 1L)) # (bsz, n_heads, head_size, seqlen)

    # calculate and normalize attention scores
    scores <- q %*% k                       # (bsz, n_heads, seqlen, seqlen)
    scores <- scores / sqrt(self$head_size) # scale

    # apply causal mask, so the model can't "look ahead" during training
    mask <- make_mask(seqlen, dtype = scores$dtype)
    scores %<>% { . + mask }

    scores <- tf$nn$softmax(scores, axis = -1L)

    # adjust values tensor with attention scores
    # scores (bsz, n_heads, seqlen, seqlen)
    # v      (bsz, n_heads, seqlen, head_size)
    output <- scores %*% v # (bsz, n_heads, seqlen, head_size)

    # combine heads back into a single features dim,
    # so Attention output_shape == input_shape
    output <- output |>
      tf$transpose(c(0L, 2L, 1L, 3L)) |> # (bsz, seqlen, n_heads, head_size)
      tf$reshape(tf$shape(x))            # (bsz, seqlen, n_heads * head_size)

    # one more trainable linear projection for good luck
    output <- self$wo(output) # (bsz, seqlen, n_heads * head_size)

    output
  }
}
Attention in LLaMA is similar but not identical to the Attention
described in the original Transformers
paper (and available as a keras
builtin under keras$layers$MultiHeadAttention()). The core novelty is
the addition of the apply_rotary_embedding() function, which we’ll
describe shortly. The additional novelty is balanced by the simplicity
from the fact that the layer is performing self-attention: we don’t need
to pass in different query, key, and value tensors (or reason about what
that means), since the same input serves all three roles. Note that the
conventional MultiHeadAttention() layer is covered quite thoroughly in
the 2nd Edition of Deep Learning with R,
including a full implementation of attention in base R.
To develop an understanding of the mechanics in a layer like this, it’s
helpful to temporarily unsee some of the minutiae that can act as a fog
obscuring the essence of the operation. In this instance, if we
temporarily strip out the transpose()s and reshape()s (as clever and
vital as they are), this is what’s left:
call <- function(x) {
  # take x as query, key, and value
  q <- x |> self$wq()
  k <- x |> self$wk()
  v <- x |> self$wv()

  # rotate q and k to inject position information
  q <- q |> apply_rotary_embedding()
  k <- k |> apply_rotary_embedding()

  # score each token pair, mask out "future" pairs, normalize
  scores <- (q %*% k) / sqrt(self$head_size)
  scores <- tf$nn$softmax(scores + make_mask(seqlen))

  # update each token by mixing in all the values, weighted by the
  # attention scores, then one more linear projection
  self$wo(scores %*% v)
}
Returning to the transpose()s and reshape()s, you can observe that
their purpose is to make it so that the attention calculations are
performed across n_heads independent subspaces, rather than in a
single larger space. The same reasoning drives this decision as that
driving usage of depthwise-separable convolutions in image models.
Empirically, for a fixed compute budget, factoring features into
independent subspaces performs better than doing the same core
operations in a single larger feature space. As with all things, there is
a balance to strike between n_heads (the number of subspaces) and
head_dim (the size of each subspace). The LLaMA authors have struck
the balance like this at the various model sizes:
lapply(c("7B", "13B", "30B", "65B"), \(size) {
  p <- read_json(weights_path("{size}/params.json"))
  with(p, list(llama_size = size,
               n_heads = n_heads,
               head_dim = dim %/% n_heads))
}) |> dplyr::bind_rows()
# A tibble: 4 × 3
llama_size n_heads head_dim
<chr> <int> <int>
1 7B 32 128
2 13B 40 128
3 30B 52 128
4 65B 64 128
Next let’s turn our attention to the causal attention mask.
make_mask <- function(seqlen, dtype = k_floatx()) {
  x <- tf$range(seqlen)
  mask <- tf$where(x[, tf$newaxis] < x[tf$newaxis, ],
                   tf$constant(-Inf, dtype = dtype),
                   tf$constant(0, dtype = dtype))

  # broadcast over batch and heads dim
  mask[tf$newaxis, tf$newaxis, , ] # (1, 1, seqlen, seqlen)
}
The mask is a strictly upper triangular matrix filled with -Inf
values. Adding the mask to the attention scores prevents the model from
being able to “look ahead” and see the attention score for a token
pairing it hasn’t seen yet at a particular position in the sequence.
This need for a mask is best thought of as a vestige from training,
an apparatus that the model needed to learn with and now it can’t function without.
During training, gradients are calculated for predictions from all
token positions in a sequence, including predictions of tokens where the correct
answer is right there, as the very next token in the same sequence. The mask
prevents the model from being able to cheat and look ahead into the future,
something it won’t be able to do once we’re running it for inference.
Here is what the mask looks like for a sequence of length 5:
make_mask(seqlen = 5L)
tf.Tensor(
[[[[  0. -inf -inf -inf -inf]
   [  0.   0. -inf -inf -inf]
   [  0.   0.   0. -inf -inf]
   [  0.   0.   0.   0. -inf]
   [  0.   0.   0.   0.   0.]]]], shape=(1, 1, 5, 5), dtype=float32)
Rotary Position Embedding
Next let’s turn our attention to apply_rotary_embedding(). This core
innovation was published by Su et al. (2022) in the paper titled
“RoFormer: Enhanced Transformer with Rotary Position Embedding”.
Some context:
-   The bare Attention() mechanism doesn’t leave any possibility for a
    token’s position in a sequence to affect the attention scores, since
    only token-pairs are scored. Attention treats its input like a
    bag-of-tokens.

-   The position of a token in a sequence is clearly important, and the
    attention layer should have access to that information.

-   The absolute position of a token in a sequence is less important
    than the relative position between tokens. (Especially so for long
    sequences).
Which leads us into the complex plane. If we imagine the features as
complex numbers, we can rotate them, and we can calculate angles between
them. From the RoFormer paper:
Specifically, incorporating the relative position embedding is
straightforward: simply rotate the affine-transformed word embedding
vector by amount of angle multiples of its position index and thus
interprets the intuition behind Rotary Position Embedding
Expanding slightly: the rotation matrix is designed so that
subsequently, after rotating our q and k token sequence embeddings
the same way, the angle between token features is a function of the
relative distance between those tokens in the token sequence. The
relative angle between two tokens is invariant to the absolute
position of those tokens in the full sequence.
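A toy sketch of that property, for a single rotation frequency (plain R, just
to build intuition; the actual rotation matrix below combines many such
frequencies at once):
freq <- 0.1
angle_at <- function(position) position * freq  # angle applied at this position
angle_at(7) - angle_at(5)      # 0.2; the relative angle depends only on
angle_at(102) - angle_at(100)  # 0.2; the distance between the two positions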
In short, the rotation injects positional information. The meaning or
interpretability of that positional information, or how it is meant to
be used, or even extracted from the result of q %*% k, is left to the
model to learn.
Here is the code:
apply_rotary_embedding <- function(x) {
  c(., seqlen, ., head_size) %<-%
    tf$unstack(tf$shape(x))

  rotation_matrix <- compute_rotation_matrix(seqlen, head_size)

  x %>%
    view_as_complex() %>%
    { . * rotation_matrix } %>%
    view_as_real()
}
compute_rotation_matrix <-
  function(seqlen, feature_dim, theta = 10000) {
    # `feature_dim` here is going to be attention$head_size
    # `seqlen` is going to match the token sequence length.

    t <- tf$range(seqlen, dtype = tf$float32)
    freqs <- tf$range(start = 0, limit = 1, delta = 1 / (feature_dim %/% 2),
                      dtype = tf$float32)
    tf_assert(tf$size(freqs) == feature_dim %/% 2)
    freqs <- 1.0 / (theta ^ freqs)

    # outer product; (seqlen, head_size/2)
    freqs <- tf$einsum('a,b->ab', t, freqs)

    rot_mat <- tf$complex(tf$cos(freqs), tf$sin(freqs))

    # the positional embedding will be broadcast across batch and heads dim
    rot_mat[tf$newaxis, , tf$newaxis, ] #(1, seqlen, 1, headdim/2)
  }
view_as_complex <- function(x) {
  tf$complex(x[all_dims(), `::2`],
             x[all_dims(), `2::2`])
}

view_as_real <- function(x) {
  # xs = (..., f); xs2 = (..., f*2)
  xs <- tf$shape(x)
  xs2 <- tf$concat(list(xs[1:(length(xs)-1)],
                        xs[length(xs), drop = FALSE] * 2L),
                   axis = 0L)

  x2 <- tf$stack(list(Re(x), Im(x)), axis = -1L)

  # (..., f, 2) -> (..., f*2)
  tf$reshape(x2, xs2)
}
As you can see, to imagine the embedding features as existing in the
complex plane, we merely treat adjacent pairs of floats in the
underlying array as the real and imaginary parts of a complex number. We
rotate the embeddings in the complex plane, then go back to imagining
the features as existing in the real plane. Again, the job of
interpreting the meaning of the features after rotation is left to the
model to learn.
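A quick illustration of that pairing (a small check of our own, not part of
the model): a feature vector of four floats becomes two complex numbers, with
adjacent floats supplying the real and imaginary parts.
x <- tf$constant(array(c(1, 2, 3, 4), dim = c(1L, 1L, 1L, 4L)), dtype = "float32")
view_as_complex(x)  # tensor of shape (1, 1, 1, 2) holding 1+2i and 3+4i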
We can quickly confirm that the rotary embeddings only rotate features
and don’t scale them:
near <- function(x, y, tol = 1e-6) abs(x - y) < tol
all(near(1, Mod(compute_rotation_matrix(2048L, 128L))))
tf.Tensor(True, shape=(), dtype=bool)
There is one more trick to observe before moving on: because of some of
the mathematical properties of the rotation matrix, it’s possible to
avoid doing a full complex multiply operation and still arrive at the
same result. Also, since the rotation matrix never changes, it makes
sense to only compute it once and cache it, like so:
precomputed_rotation_matrix <- compute_rotation_matrix(
seqlen = 2048L, # LLaMA max seqlen
feature_dim = with(params, dim %/% n_heads) # head_size
)
apply_rotary_embedding_faster <- function(x) {

  rotate_every_two <- function(x) {
    x1 <- x[all_dims(), `::2`]
    x2 <- x[all_dims(), `2::2`]
    x_ <- tf$stack(list(-x2, x1), axis = -1L)
    tf$reshape(x_, tf$shape(x))
  }

  repeat_each_twice <- function(x) {
    tf$`repeat`(x, 2L, axis = -1L)
  }

  seqlen <- tf$shape(x)[2]
rot <- precomputed_rotation_matrix[, NA:seqlen, , ]
cos <- Re(rot) |> repeat_each_twice()
sin <- Im(rot) |> repeat_each_twice()
(x * cos) + (rotate_every_two(x) * sin)
}
rand <- tf$random$uniform(shape(3, 8, params$n_heads, 128))
all(apply_rotary_embedding(rand) ==
apply_rotary_embedding_faster(rand))
tf.Tensor(True, shape=(), dtype=bool)
apply_rotary_embedding <- apply_rotary_embedding_faster
Finally, note that the rotary positional embeddings are applied within
each Attention layer. This is different from the original Transformer
implementation, where a positional embedding was only added once at the
head of the model. Similar to residual connections, you can think of the
presence of these repeated injections of positional information as
relieving the remaining trainable layers from the burden of allocating
some of their weights to the task of “passing through” or “preserving”
the positional information for later layers.
Positional embeddings are a rich subject that also comes up in other
deep learning architectures, like denoising diffusion (Falbel and Keydana 2023),
so time spent understanding them better is time well
spent. For the purposes of this blog post we’ve covered the points
needed and we’ll move on to tying all the pieces together. To go deeper
and develop a more mathematically informed understanding of RoPE, an
excellent starting point is the original paper by Su et al. (2022).
Tying it all together
With Tokenizer, Embedding, TransformerBlock (RMSNorm,
Attention, FeedForward and apply_rotary_embedding) all covered,
it’s time to tie all the pieces together into a Transformer model. We
could do this using %py_class% like with the other layers above, but
it’s just as easy to move over to using the Keras functional API at this
point.
layer_transformer_block <- create_layer_wrapper(TransformerBlock)
layer_rms_norm <- create_layer_wrapper(RMSNorm)
# input to the model will be output from the tokenizer
input <- layer_input(shape(NA)) #, dtype = "int32")

x <- input |>
  tok_embeddings()  # instantiated earlier in the blog post

for (block_id in seq_len0(params$n_layers))
  x <- x |>
    layer_transformer_block(attn_head_size = params$dim %/% params$n_heads,
                            attn_n_heads = params$n_heads,
                            norm_eps = params$norm_eps,
                            block_id = block_id)
# final output projection into logits of output tokens
x <- x |>
  layer_rms_norm(block_id = -1, eps = params$norm_eps) |>
  layer_dense(
    tokenizer$vocab_size(), use_bias = FALSE,
    kernel_initializer = \(...) np$load(weights_path("7B/output.weight.npy"))$`T`
  )
# slice out the logits for the final token
with_options(c(tensorflow.extract.warn_negatives_pythonic = FALSE), {
output <- x[, -1, ]
})
llama <- keras_model(input, output) %>%
compile(jit_compile = TRUE)
The input to the model is tokenized text and the output is the
(unnormalized) probabilities for each token in tokenizer$vocab_size()
being the next token in the sequence.
next_token_probs <- prompt %>%
tokenizer$tokenize() %>%
llama()
next_token_probs
tf.Tensor(
[[-2.4503722e+00 -3.4463339e+00 1.3200411e+01 ... 4.8804146e-01
-1.3277926e+00 9.9985600e-03]], shape=(1, 32000), dtype=float32)
Sampling strategies for selecting a token from the token logits is a
rich topic, (also covered thoroughly in the Deep Learning with
R book), but this blog post is long enough
already. So for now, let’s just take the argmax().
sampler <- \(logits) tf$argmax(logits, axis = -1L, output_type = "int32")
(next_token <- sampler(next_token_probs))
tf.Tensor([304], shape=(1), dtype=int32)
tokenizer$detokenize(next_token) |> as.character()
[1] "to"
Let’s run it for a few tokens and let LLaMA finish the sentence:
prompt_tokens <- tokenizer$tokenize("The best way to attract bees")
for (i in 1:20) {
next_token_probs <- prompt_tokens |> llama()
next_token <- sampler(next_token_probs)
prompt_tokens %<>% { tf$concat(c(., next_token), axis = -1L) }
# end of sentence
if (as.logical(next_token == tokenizer$string_to_id(".")))
break
}
prompt_tokens |>
tokenizer$detokenize() |>
as.character() |>
strwrap(60) |> writeLines()
The best way to attract bees to your garden is to plant a
variety of flowers that bloom at different times.
Wrapping up
In this blog post we’ve walked through the LLaMA architecture
implemented in R TensorFlow, including how to load pretrained weights,
and then run the model to generate a sentence. Note, much of the code in
this blog post is tailored for didactic purposes. While the
implementation of the LLaMA architecture covered in this blog post is
appropriate for training, there are a few modifications you’ll want to
make before doing a lot of text generation. These include things like:
-   In the Attention layer, caching the k and v tensors. Then, after
    the first forward pass with the initial prompt, only feeding the
    model the one new token from the sampler(), rather than feeding the
    model all the tokens of the full prompt on each forward pass. (A
    minimal sketch of the caching idea follows this list.)

-   Only generating the causal mask make_mask() and rotary_matrix
    slices once per forward pass, instead of within each Attention
    call.

-   Updating the TransformerBlock to be cache-aware and to pass
    through the appropriate arguments to Attention().

-   Wrapping all the additional book-keeping logic in a custom
    TransformerDecoder() class.
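To make the first item a little more concrete, here is a minimal standalone
sketch of the kv-cache idea (the helper name and the cache list are ours, for
illustration; the fuller implementation linked below does the real
book-keeping):
grow_kv_cache <- function(cache, k_new, v_new) {
  # cache$k, cache$v hold projections of all previously seen tokens:
  #   (bsz, seqlen_so_far, n_heads, head_size)
  # k_new, v_new are the projections of just the newest token:
  #   (bsz, 1, n_heads, head_size)
  list(k = tf$concat(list(cache$k, k_new), axis = 1L),
       v = tf$concat(list(cache$v, v_new), axis = 1L))
}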
The changes required to implement these optimizations for inference
balloon the code size and are mostly about book-keeping, so we won’t go
through them in this blog post. However, you can find a fuller
implementation of LLaMA in R TensorFlow, including a cache-aware
generate() method that only feeds the model one token at a time during
the main inference loop, (and compiles to XLA!),
here.
That’s all for now. Thanks for reading and happy travels to all
exploring this exciting LLM terrain!
Photo by Sébastien Goldberg on Unsplash
