cool hit counter Data types in Redis - exploring the details_Intefrankly

Data types in Redis - exploring the details


Picking up from the previous post Why Redis, today we'll talk about specific Redis data types and commands. This post is a great foundation for a deeper understanding of Redis, so sit tight, ahead Long text warning

This series is based on: redis-3.2.12 The text will not cover all the commands, but mainly the ones that you often encounter in your work.

Normally, most of the information we read tells us simply and brutally what this command does and how many parameters that command takes. This way will only know what is known, this article from the time complexity of the command to the use, and then the corresponding type in the Redis low-level use of what structure to save data, I hope to let you know more profoundly, when using the heart more.

1. Here in the reading please note.Although the time complexity of many commands is O(n), note the specific meaning of what their n represents

2. It will be used in the text OBJECT ENCODING xxx to check the internal coding of Redis, which actually reads the redisObject struct encoding The value represented by.redisObject Provides a uniform representation of different types of data.

String type

It should be said that this is the most widely used data type in Redis. Some of the commands in this type are used in a wide range of scenarios. For example.

  • caching, which is very much in use.
  • Counter/speed limiter technology.
  • The shared Session server is also based on this data type

note: The table simply statesString hit the target12 The command, use The scenes are also only partially listed。

We are often told to teach MSET/MGET such commands are used sparingly because their time complexity is O(n), but in fact, note here that n indicates the number of keys set or read this time, so if you do not read a lot of keys in bulk, and the content of each key is not very large, then use the bulk operation command instead to save the time of network requests and transfers.

internal structure

How does data of type String end up being stored in Redis? If you want to look into it, you have to start with SDS This structure speaks for itself, but today I'll press on without the details of this source code part and just talk about the data structure it holds internally. Eventually the strings we set are stored in one of three forms.

  • Int, an 8-byte long integer with a maximum value of: 0x7fffffffffffffffL
  • Embstr, a string of less than or equal to 44 bytes
  • Raw

Combine the code to see how Redis makes decisions about these three data structures. When we use the command on the client side SET test hello,redis When the command is received, the client saves the command to a buf, and then executes the commands in the order they are received. One of these functions is.processMultibulkBuffer() , which internally calls the createStringObject() Function.

#define OBJ_ENCODING_EMBSTR_SIZE_LIMIT 44

robj *createStringObject(const char *ptr, size_t len) {
     // Check the length of the saved string and select the corresponding type
    if (len <= OBJ_ENCODING_EMBSTR_SIZE_LIMIT) 
        return createEmbeddedStringObject(ptr,len);
    else
        return createRawStringObject(ptr,len);
}

not understandC Language doesn't matter., Here it is. inspections The string we entered hello,redis The length is Is it more than 44 , If more than the type used raw If not, use embstr . An experimental look at.

127.0.0.1:6379> SET test 12345678901234567890123456789012345678901234 // len=44
OK
127.0.0.1:6379> OBJECT encoding test
"embstr"
127.0.0.1:6379> SET test 123456789012345678901234567890123456789012345 // len=45
OK
127.0.0.1:6379> OBJECT encoding test
"raw"

It can be seen that once 44 is exceeded, the underlying type becomes.raw . Wait, didn't we also mention above that there is a int Type? I can't see it at all from the function, huh? There's no rush, when this command we've entered is actually going to start executing, which is when the function is called setCommand() When it does, it triggers a tryObjectEncoding() function, which attempts to compress the input string, continue with the code at

robj *tryObjectEncoding(robj *o) {
    ... ...
     len = sdslen(s);       // Less than or equal to 20 in length and capable of being converted into a long shape
    if(len <= 20 && string2l(s,len,&value)) {
        o->encoding = OBJ_ENCODING_INT;
    }
    ... ...
}

This function has been drastically scaled down by me, but simply we are able to see that it determines if the length is less than or equal to 20 and tries to convert to an integer, look at the example.

9223372036854775807 is the largest integer that can be represented by an 8-bit byte, and its hexadecimal form is0x7fffffffffffffffL

127.0.0.1:6379> SET test 9223372036854775807
OK
127.0.0.1:6379> OBJECT encoding test
"int"
127.0.0.1:6379> SET test 9223372036854775808 //  Bigger than above1
OK
127.0.0.1:6379> OBJECT encoding test
"embstr"

so far, with respect toString The type selection process is complete for the。 The value of this reference for us is, we (located) at use String type When saving data, To take into account that the underlying layer corresponds to a different type, Different types (located) atRedis Different processes will be implemented internally, Its corresponding implementation efficiency、 Memory consumption is all different。

Hash type

We often use it to hold a structured piece of data, such as cached information associated with a user. If you use a normal String type, you need to serialize and deserialize the string, which undoubtedly adds extra overhead and only reads it all out each time.

  • Cache structured data, such as article information, with the flexibility to modify one of its fields, such as the number of reads.

The Hash type holds structured data, much like a record in MySQL, where we can easily modify a particular field, but it is more flexible, with each record able to contain different fields.

internal structure

Two types of data structures may exist for internal Hash type data.

  • ZipList, more space-saving, restrictions: key and field length not more than 64, the number of fields in the key not more than 512
  • HashTable

as far as sth is concernedHash,Redis First set it by default to use ZipList data structure, with subsequent judgments about whether changes are needed based on conditions.

void hsetCommand(client *c) {
    int update;
    robj *o;
    if ((o = hashTypeLookupWriteOrCreate(c,c->argv[1])) == NULL) return;
    hashTypeTryConversion(o,c->argv,2,3);//  Decision based on length
    ... ...
    update = hashTypeSet(o,c->argv[2],c->argv[3]);//  Decision making based on the number of elements
    addReply(c, update ? shared.czero : shared.cone);
    ... ...
}

hashTypeLookupWriteOrCreate() Internally, it calls createHashObject() Creates a Hash object.

robj *createHashObject(void) {
    unsigned char *zl = ziplistNew();
    robj *o = createObject(OBJ_HASH, zl);
    o->encoding = OBJ_ENCODING_ZIPLIST;//  Set the code ziplist
    return o;
}

hashTypeTryConversion() The function is internally based on whether it exceeds the hash_max_ziplist_value The length of the limit (64) to determine the data structure of the lower level.

void hashTypeTryConversion(robj *o, robj **argv, int start, int end) {
        int i;
        if (o->encoding != OBJ_ENCODING_ZIPLIST) return;
        for (i = start; i <= end; i++) {
            //  inspections field  together with value  The length is super-long
            if (sdsEncodedObject(argv[i]) &&
                sdslen(argv[i]->ptr) > server.hash_max_ziplist_value)
            {
                hashTypeConvert(o, OBJ_ENCODING_HT);
                break;
            }
        }
}

then (afterwards) (located) at function hashTypeSet() Check if the number of fields in the hash_max_ziplist_entries limit (512).

int hashTypeSet(robj *o, robj *field, robj *value) {
    int update = 0;
    if (o->encoding == OBJ_ENCODING_ZIPLIST) {
        ... ...
        //  inspectionsfield Does the number of individuals exceed512
        if (hashTypeLength(o) > server.hash_max_ziplist_entries)
            hashTypeConvert(o, OBJ_ENCODING_HT);
        } else if (o->encoding == OBJ_ENCODING_HT) {
          ... ...
    }
    ... ...
    return update;
}

To verify the logic above.

127.0.0.1:6379> HSET test name qweqweqwkejkksdjfslfldsjfkldjslkfqweqweqwkejkksdjfslfldsjfkldjsl
(integer) 1
127.0.0.1:6379> HSTRLEN test name
(integer) 64
127.0.0.1:6379> OBJECT encoding test
"ziplist"
127.0.0.1:6379> HSET test name qweqweqwkejkksdjfslfldsjfkldjslkfqweqweqwkejkksdjfslfldsjfkldjslq
(integer) 0
127.0.0.1:6379> HSTRLEN test name
(integer) 65
127.0.0.1:6379> OBJECT encoding test
"hashtable"

For the case of key setting over 64, and the limit of number of fields over 512, you can test it by yourself.

List type

List type The use of the, The main outline of common scenarios:

  • Message queue: LPUSH + BRPOP (blocking feature)
  • Caching: users record various records, the most important feature is that it can support paging
  • Stack: LPUSH + LPOP
  • Queue: LPUSH + RPOP
  • Finite queue: LPUSH + LTRIM, which maintains the amount of data in the queue

internal structure

There are several low-level implementations of the List data type.

  • QuickList: it is a LinkedList with ZipList as node
  • ZipList (memory saving). (located) at3.2.12 There are places found in the version use
  • LinkedList, (located) at3.2.12 There are places found in the version use

Some articles on the web say LinkedList (located) at Redis 4.0 Subsequent versions have not been use, I actually found that Redis 3.2.12 The structure is also no longer used in the version (not directly as a data storage structure), including ZipList (located) at 3.2.12 None of the versions are being used to store data directly anymore.

Let's do an experiment to verify this, we set up a List with 1000 elements, each of which has a value length of more than 64 six characters。

127.0.0.1:6379> LLEN test
(integer) 1000
127.0.0.1:6379> OBJECT encoding test
"quicklist"
127.0.0.1:6379> LINDEX test 0
"qweqweqwkejkksdjfslfldsjfkldjslkfqweqweqwkejkksdjfslfldsjfkldjslq" // 65 six characters

Whether we change the number of elements of the list and the length of the element values, the structure is QuickList . If you still don't believe me, let's look at the code.

void pushGenericCommand(client *c, int where) {
    int j, waiting = 0, pushed = 0;
    robj *lobj = lookupKeyWrite(c->db,c->argv[1]);
    ... ...
    for (j = 2; j < c->argc; j++) {
        c->argv[j] = tryObjectEncoding(c->argv[j]);
        if (!lobj) {
            //  establish quick list
            lobj = createQuicklistObject();
            quicklistSetOptions(lobj->ptr, server.list_max_ziplist_size,
                                server.list_compress_depth);
            dbAdd(c->db,c->argv[1],lobj);
        }
        listTypePush(lobj,c->argv[j],where);
        pushed++;
    }
    ... ...
}

At the initial word, the call to createQuicklistObject() Set its low-level data structure to.quick list . There is no place in the subsequent process for further transformation of this structure.

Set type

One of the important features of the Set type is that it can be de-duplicated and unordered. The nature of its collection can have a wide range of uses in social situations.

  • Common concern
  • common preference
  • Data de-duplication

internal structure

The Set low-level implementation uses two data structures.

  • IntSet, The set members are all integers( Cannot exceed the maximum integer) and the number of members of the set is less than512 time use。
  • HashTable

The code for this command is as follows, with the two important calls regarding the decision type being.setTypeCreate() harmony setTypeAdd()

void saddCommand(client *c) {
    robj *set;
    ... ...
    if (set == NULL) {
         // Initialization
        set = setTypeCreate(c->argv[2]);
    } else {
        ... ...
    }
    for (j = 2; j < c->argc; j++) {
         // Internally, it checks if the number of elements is expanded enough to require a change to the low-level structure
        if (setTypeAdd(set,c->argv[j])) added++;
    }
    ... ...
}

Take a look at the initial creation code for the Set structure object.

robj *setTypeCreate(robj *value) {
    if (isObjectRepresentableAsLongLong(value,NULL) == C_OK)
        return createIntsetObject(); //  useIntSet
    return createSetObject(); //  useHashTable
}

isObjectRepresentableAsLongLong() Internal determination of its integer range, If it is an integer and does not exceed the maximum integer it will use IntSet to save. Otherwise use HashTable . Then the number of elements will be checked.

int setTypeAdd(robj *subject, robj *value) {
    long long llval;
    if (subject->encoding == OBJ_ENCODING_HT) {
         ... ...
    } else if (subject->encoding == OBJ_ENCODING_INTSET) {
        if (isObjectRepresentableAsLongLong(value,&llval) == C_OK) {
            uint8_t success = 0;
            subject->ptr = intsetAdd(subject->ptr,llval,&success);
            if (success) {
                /* Convert to regular set when the intset contains
                 * too many entries. */
                if (intsetLen(subject->ptr) > server.set_max_intset_entries)
                    setTypeConvert(subject,OBJ_ENCODING_HT); 
                return 1;
            }
        } else {
            /* Failed to get integer from object, convert to regular set. */
            setTypeConvert(subject,OBJ_ENCODING_HT);
            ... ...
            return 1;
        }
    }
   ... ...
   return 0;
}

Look at the example, here with the maximum integer critical value.

127.0.0.1:6379> SADD test 9223372036854775807
(integer) 1
127.0.0.1:6379> OBJECT encoding test
"intset"
127.0.0.1:6379> SADD test 9223372036854775808
(integer) 1
127.0.0.1:6379> OBJECT encoding test
"hashtable"

For tests on the number of sets, please complete your own observations.

SortSet type

Nowadays, apps have some kind of leaderboard or something like that, such as investment sites showing the ranking of investment amounts, shopping sites showing the ranking of spending, etc. SortSet is perfect for doing this. Commonly used to address the following issues.

  • Rankings by category
  • Set the execution task weights, and the background script executes the relevant actions according to their sort order
  • Range lookup to find which range of a set a value is in

internal structure

Although an ordered set is also a collection, the lower-level data structure is different from Set, which also has two data structures, namely.

  • ZipList, used when the number of elements in an ordered collection is less than or equal to 128 or the length of a member is less than or equal to 64
  • SkipList

This conversion process is as follows.

void zaddGenericCommand(client *c, int flags) {
    if (zobj == NULL) {
        if (xx) goto reply_to_client; /* No key + XX option: nothing to do. */
        if (server.zset_max_ziplist_entries == 0 ||
            server.zset_max_ziplist_value < sdslen(c->argv[scoreidx+1]->ptr))
        {
            zobj = createZsetObject();// skip list
        } else {
            zobj = createZsetZiplistObject();// zip list
        }
        dbAdd(c->db,key,zobj);
    } else {
        ... ...
    }
    ... ...
    if (zobj->encoding == OBJ_ENCODING_ZIPLIST) {
        if (zzlLength(zobj->ptr) > server.zset_max_ziplist_entries)
            zsetConvert(zobj,OBJ_ENCODING_SKIPLIST);//  Coding based on number of conversions
        if (sdslen(ele->ptr) > server.zset_max_ziplist_value)
            zsetConvert(zobj,OBJ_ENCODING_SKIPLIST);//  Conversion of codes according to length
    }
}

Here is an example with member length over 64.

127.0.0.1:6379> ZADD test 77 qwertyuiopqwertyuiopqwertyuiopqwertyuiopqwertyuiopqwertyuiopqwer // member The length is 64
(integer) 1
127.0.0.1:6379> OBJECT encoding test
"ziplist"
127.0.0.1:6379> ZADD test 77 qwertyuiopqwertyuiopqwertyuiopqwertyuiopqwertyuiopqwertyuiopqwerq // member The length is65
(integer) 1
127.0.0.1:6379> OBJECT encoding test
"skiplist"

When we member exceeds 64 bits in length, the lower-level data structure is defined by the ZipList Turned into SkipList . Try moving your hands for the rest of the test for the number of elements.

Global Common Commands

For global commands, operations are possible regardless of what type of data the corresponding key is. Among the things to note KEYS This command, which cannot be used online because of the Redis single-threaded mechanism, will operate with severe blocking if there is too much data in memory, causing the entire Redis service to be unresponsive.

conclude

  • Each type of Redis command has a different time complexity, some related to the number of corresponding elements; some related to the number of requests; and
  • Rationalize the number of elements and their lengths, aiming for the simplest data structure for the Redis base.
  • Pay attention to time complexity, know their Redis internal element profile and avoid blocking.
  • The simpler the data, the better the performance.
  • Redis Each data type low-level corresponds to multiple data structures, modify together with Extensions are not perceptible to the upper layers。

The first one talked about why you should use Redis, and this article talks about the vast majority of commands again, right, and some of the implementations of them in the Redis source code, and then starts to focus on some of the operations in concrete practice. I hope this is helpful to you all. Looking forward to any kind of criticism and encouragement


Recommended>>
1、JSCushioned MotionCoupled Suspension Frame
2、30 seconds to evaluate a web page
3、Industry AI Imaging track opens how Meitu is making a name for itself in artificial intelligence
4、The current state of the fishing offensedefense matchup today the defensive side is completely behind and beaten
5、Im asking if youre afraid The worlds number one private equity fund Bridgewater officially enters China to cut leeks with detailed explanation of investment strategy

    已推荐到看一看 和朋友分享想法
    最多200字,当前共 发送

    已发送

    朋友将在看一看看到

    确定
    分享你的想法...
    取消

    分享想法到看一看

    确定
    最多200字,当前共

    发送中

    网络异常,请稍后重试

    微信扫一扫
    关注该公众号