Cassandra query performs differently at different times

I am using apache-cassandra-2.0.12 in production with NetworkTopologyStrategy and a replication factor of 3, in a cluster with 2 DCs of 4 nodes each.

While analyzing the response times for read requests, we found that some queries sometimes perform much slower than they normally do.
For example, consider the following table:

Create ColumnFamily "Employee" (
  empID bigint,
  uniqueID text,
  col1 text,
  col2 text,
  col3 text,
  primary key (empID, uniqueID)
);

This CF contains more than 5K row entries, and each employee contains a minimum of 100K columns and a maximum of 1000K columns.

In this CF, the response time for the following query varies drastically from time to time:

SELECT * FROM "Employee" WHERE empID = xxx AND uniqueID = 'value';

Sometimes the response time for the above query is more than 3 seconds, whereas it should take less than 50 milliseconds.

I have monitored the load (compaction time, disk utilization, etc.), CPU, and memory of the nodes at those times. All of these parameters were normal.

Is there anything I have missed, or is this normal behavior for Cassandra?

Note: I don't have any tombstone columns in this CF.

edited May 4 '16 at 7:11

asked May 3 '16 at 5:42

Shanmugaapriyan p


Your Table does not match your key. Could you update it with the correct schema?

– RussS
May 3 '16 at 15:53

What consistency level is used for your reads?

– Castaglia
May 3 '16 at 17:45

@RussS Thank you. I have changed my schema.

– Shanmugaapriyan p
May 4 '16 at 7:13

@Castaglia I am using LOCAL_QUORUM to read the data.

– Shanmugaapriyan p
May 4 '16 at 7:13

Try with TRACING ON

– undefined_variable
May 5 '16 at 7:40

2 Answers

If a request hits a data node as the master node, it only has to search the other 2 nodes, so the response time is lower.

If a request hits a non-data node as the master node, it needs to get data from all 3 nodes (with a replication factor of 3), so the response time will be higher.

CL=local_quorum

edited Dec 8 '18 at 17:04

Glorfindel


answered Dec 8 '18 at 5:11

ravi tammana

First of all, there is no "master" node in Cassandra. Secondly, querying at LOCAL_QUORUM with a RF of 3 will pull data from 2 nodes, not 3.

– Aaron
Dec 13 '18 at 15:01
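
The replica count in that comment follows from the quorum rule: a quorum is a majority of the replicas, i.e. floor(RF / 2) + 1. The snippet below is only illustrative arithmetic, not anything from the original thread. The function name `quorum` is made up for this example, and it assumes the replication factor of 3 applies to the local data center. It does not talk to a cluster.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a quorum: a majority of RF."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


# Assumption: RF = 3 in the local data center, as described in the question.
local_rf = 3
print(quorum(local_rf))  # 2
```

So with RF = 3, a LOCAL_QUORUM read waits on 2 replicas in the local DC, whichever node happens to coordinate the request.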


primary key (empID,uniqueID)

This CF contains more than 5K row entries, and each employee contains a minimum of 100K columns and a maximum of 1000K columns.

That's WAY too many rows per partition. My guess is that the query slowdowns happen when large partitions are queried. It all depends on data cell value size and data width, but as a general rule, I would not model more than 10k-30k rows per partition.

To test this out, you could run nodetool tablehistograms on your table (the command is named cfhistograms on older releases such as 2.0.x) to gauge things like max cell count and partition size. Then run your query against both small and large partitions with TRACING ON, and I'm sure you'll see the difference.

Basically, try reworking your model for smaller partitions.
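
One common way to get smaller partitions is to add a bucket to the partition key, for example `primary key ((empID, bucket), uniqueID)`. The Python sketch below is a hypothetical illustration rather than a tested recommendation for this cluster: `buckets_needed`, `bucket_for`, and the 20,000-row target are assumptions chosen to fit the 10k-30k guideline above. Because the bucket is derived from `uniqueID`, a point lookup by `empID` and `uniqueID` can still compute which partition to read.

```python
import hashlib
import math


def buckets_needed(max_rows_per_employee: int, target_rows_per_partition: int) -> int:
    """Buckets required so the average partition stays near the target size."""
    return math.ceil(max_rows_per_employee / target_rows_per_partition)


def bucket_for(unique_id: str, num_buckets: int) -> int:
    """Map a uniqueID to a stable bucket number in [0, num_buckets)."""
    # md5 is used only as a stable, well-distributed hash (not for security);
    # Python's built-in hash() is randomized per process, so it is avoided here.
    digest = hashlib.md5(unique_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


# Assumption: worst case of 1000K entries per employee, target of 20K per partition.
NUM_BUCKETS = buckets_needed(1_000_000, 20_000)  # 50

# The application computes the bucket before reading or writing, e.g. for
#   SELECT * FROM "Employee" WHERE empID = ? AND bucket = ? AND uniqueID = ?;
bucket = bucket_for("value", NUM_BUCKETS)
```

The trade-off is that reading every entry for one employee now means querying all buckets, so this kind of model suits workloads dominated by lookups on a known `uniqueID`.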

answered Dec 13 '18 at 15:14

Aaron
