Cassandra query performs differently at different times



























I am using apache-cassandra-2.0.12 in production, with NetworkTopologyStrategy and a replication factor of 3, in a cluster with 2 DCs, each containing 4 nodes.

While analyzing the response time for read requests, we found that some queries perform slower than expected.
E.g., consider the following table:



CREATE COLUMNFAMILY "Employee"
(
    empID bigint,
    uniqueID text,
    col1 text,
    col2 text,
    col3 text,
    PRIMARY KEY (empID, uniqueID)
);


This CF contains data for more than 5K row entries, and each employee contains a minimum of 100K columns and a maximum of 1000K columns.

So in this CF, the response time for the following query differs drastically from time to time:



SELECT * FROM "Employee" WHERE empID = xxx AND uniqueID = 'value';


Sometimes the response time for the above query is more than 3 seconds, whereas it should take less than 50 milliseconds.

I have monitored the load (compaction time, disk utilization, etc.), CPU, and memory of the nodes at that time. All of these parameters were normal.

Is there anything that I have missed, or is this the normal behavior of Cassandra?

Note: I don't have any tombstone columns in this CF.










cassandra














edited May 4 '16 at 7:11
Shanmugaapriyan p

asked May 3 '16 at 5:42
Shanmugaapriyan p




















  • Your Table does not match your key. Could you update it with the correct schema?

    – RussS
    May 3 '16 at 15:53











  • What consistency level is used for your reads?

    – Castaglia
    May 3 '16 at 17:45











  • @RussS Thank you. Changed my schema.

    – Shanmugaapriyan p
    May 4 '16 at 7:13













  • @Castaglia I am using LOCAL_QUORUM to read the data.

    – Shanmugaapriyan p
    May 4 '16 at 7:13











  • Try with TRACING ON

    – undefined_variable
    May 5 '16 at 7:40





























2 Answers
If a request hits a data node as the master node, the search covers only the other 2 nodes, so the response time is lower.

If a request hits a non-data node as the master node, it needs to get data from 3 nodes in the case of a replication factor of 3, so the response time will be higher.

CL=local_quorum















edited Dec 8 '18 at 17:04
Glorfindel

answered Dec 8 '18 at 5:11
ravi tammana













    • First of all, there is no "master" node in Cassandra. Secondly, querying at LOCAL_QUORUM with a RF of 3 will pull data from 2 nodes, not 3.

      – Aaron
      Dec 13 '18 at 15:01
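
The arithmetic behind the comment above can be written out as a minimal sketch. It assumes the usual quorum rule of floor(RF / 2) + 1 replicas; the function name `quorum` is only illustrative and is not part of any Cassandra API.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must answer a quorum read: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


# With a replication factor of 3 (as in the question), a LOCAL_QUORUM read
# waits for 2 replicas in the local data center, not all 3.
print(quorum(3))
```

This only shows how many replica responses the read waits for; it says nothing about which node coordinates the request.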



















primary key (empID,uniqueID)

This CF contains data for more than 5K row entries, and each employee contains a minimum of 100K columns and a maximum of 1000K columns

That's WAY too many rows per partition. My guess is that the query slowdowns happen when large partitions are queried. It all depends on data cell value size and data width, but as a general rule, I would not model more than 10k-30k rows per partition.

To test this out, you could run nodetool tablehistograms (named cfhistograms in older releases such as 2.0) on your table to gauge things like max cell count and partition size. Then run your query against both small and large partitions with TRACING ON, and I'm sure you'll see the difference.

Basically, try reworking your model for smaller partitions.
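
One common way to get smaller partitions is to add a bucket component to the partition key. The sketch below is an illustration under stated assumptions, not something taken from the answer: the bucket count of 100, the helper names `bucket_for` and `partition_key`, and the idea of a key shaped like `PRIMARY KEY ((empID, bucket), uniqueID)` are all hypothetical. It derives the bucket from uniqueID, so a point lookup by empID and uniqueID can still compute which partition to read.

```python
import zlib
from typing import Tuple

# Hypothetical bucket count: splits the largest partition mentioned in the
# question (about 1,000,000 rows) into buckets of roughly 10,000 rows each.
NUM_BUCKETS = 100


def bucket_for(unique_id: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a uniqueID to a stable bucket number in [0, num_buckets)."""
    # crc32 is deterministic across processes, unlike the built-in hash() for str.
    return zlib.crc32(unique_id.encode("utf-8")) % num_buckets


def partition_key(emp_id: int, unique_id: str) -> Tuple[int, int]:
    """Composite partition key (empID, bucket) in place of empID alone."""
    return (emp_id, bucket_for(unique_id))


# The same inputs always land in the same partition, so reads and writes agree.
print(partition_key(42, "value"))
```

The trade-off is that reading every row for one employee now means querying all buckets, so this layout suits workloads dominated by lookups on both empID and uniqueID, like the query in the question.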

















answered Dec 13 '18 at 15:14
Aaron





























