Large INSERTs performance optimization
I have 15 Amazon AWS EC2 t1.micro instances which simultaneously populate Amazon RDS MySQL d2.m2.xlarge database with data using large INSERTs (40000 rows in query).
The queries are sent continuously. The table is INNODB, two INT columns, there is index for both columns. CPU utilization of RDS instance is about 30% during data receiving.
When I have one EC2 instance, the speed is in orders of magnitude faster than I run 15 instances simultaneously. In light of this, the 15-instances group works slower and slower until the speed becomes totally unsatisfactory.
How can I optimize performance for this process?
UPDATE
My SHOW CREATE TABLE
result is the following:
CREATE TABLE `UserData` (
`uid` int(11) NOT NULL,
`data` int(11) NOT NULL,
PRIMARY KEY (`uid`,`data`),
KEY `uid` (`uid`),
KEY `data` (`data`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
I need 2 indexes because it is necessary for me to fetch data by uid
and by data
value.
I insert data with INSERT INTO UserData (uid, data) VALUES (1,2),(1,3),(1,10),...
with 40000 (uid,data)
pairs.
15 parallel instances insert ~121 000 000 rows in 2 hours, but I am sure that it can be much more faster.
mysql innodb optimization insert amazon-rds
add a comment |
I have 15 Amazon AWS EC2 t1.micro instances which simultaneously populate Amazon RDS MySQL d2.m2.xlarge database with data using large INSERTs (40000 rows in query).
The queries are sent continuously. The table is INNODB, two INT columns, there is index for both columns. CPU utilization of RDS instance is about 30% during data receiving.
When I have one EC2 instance, the speed is in orders of magnitude faster than I run 15 instances simultaneously. In light of this, the 15-instances group works slower and slower until the speed becomes totally unsatisfactory.
How can I optimize performance for this process?
UPDATE
My SHOW CREATE TABLE
result is the following:
CREATE TABLE `UserData` (
`uid` int(11) NOT NULL,
`data` int(11) NOT NULL,
PRIMARY KEY (`uid`,`data`),
KEY `uid` (`uid`),
KEY `data` (`data`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
I need 2 indexes because it is necessary for me to fetch data by uid
and by data
value.
I insert data with INSERT INTO UserData (uid, data) VALUES (1,2),(1,3),(1,10),...
with 40000 (uid,data)
pairs.
15 parallel instances insert ~121 000 000 rows in 2 hours, but I am sure that it can be much more faster.
mysql innodb optimization insert amazon-rds
Please present theSHOW CREATE TABLE your_table
output. In particular, is there anAUTO_INCREMENT
column? WhichPRIMARY KEY
? What is the order by which rows are inserted?
– Shlomi Noach
May 13 '13 at 5:10
@ShlomiNoach Thank you for reply, I have updated the post
– zavg
May 13 '13 at 12:42
Are your INSERTs enclosed into a transaction?START TRANSACTION;
INSERT
... andCOMMIT;
– Cristian Porta
May 13 '13 at 18:50
@crisrian-porta Not, but I use bulk inserts with 40000 values in one query.
– zavg
May 13 '13 at 19:04
add a comment |
I have 15 Amazon AWS EC2 t1.micro instances which simultaneously populate Amazon RDS MySQL d2.m2.xlarge database with data using large INSERTs (40000 rows in query).
The queries are sent continuously. The table is INNODB, two INT columns, there is index for both columns. CPU utilization of RDS instance is about 30% during data receiving.
When I have one EC2 instance, the speed is in orders of magnitude faster than I run 15 instances simultaneously. In light of this, the 15-instances group works slower and slower until the speed becomes totally unsatisfactory.
How can I optimize performance for this process?
UPDATE
My SHOW CREATE TABLE
result is the following:
CREATE TABLE `UserData` (
`uid` int(11) NOT NULL,
`data` int(11) NOT NULL,
PRIMARY KEY (`uid`,`data`),
KEY `uid` (`uid`),
KEY `data` (`data`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
I need 2 indexes because it is necessary for me to fetch data by uid
and by data
value.
I insert data with INSERT INTO UserData (uid, data) VALUES (1,2),(1,3),(1,10),...
with 40000 (uid,data)
pairs.
15 parallel instances insert ~121 000 000 rows in 2 hours, but I am sure that it can be much more faster.
mysql innodb optimization insert amazon-rds
I have 15 Amazon AWS EC2 t1.micro instances which simultaneously populate Amazon RDS MySQL d2.m2.xlarge database with data using large INSERTs (40000 rows in query).
The queries are sent continuously. The table is INNODB, two INT columns, there is index for both columns. CPU utilization of RDS instance is about 30% during data receiving.
When I have one EC2 instance, the speed is in orders of magnitude faster than I run 15 instances simultaneously. In light of this, the 15-instances group works slower and slower until the speed becomes totally unsatisfactory.
How can I optimize performance for this process?
UPDATE
My SHOW CREATE TABLE
result is the following:
CREATE TABLE `UserData` (
`uid` int(11) NOT NULL,
`data` int(11) NOT NULL,
PRIMARY KEY (`uid`,`data`),
KEY `uid` (`uid`),
KEY `data` (`data`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
I need 2 indexes because it is necessary for me to fetch data by uid
and by data
value.
I insert data with INSERT INTO UserData (uid, data) VALUES (1,2),(1,3),(1,10),...
with 40000 (uid,data)
pairs.
15 parallel instances insert ~121 000 000 rows in 2 hours, but I am sure that it can be much more faster.
mysql innodb optimization insert amazon-rds
mysql innodb optimization insert amazon-rds
edited 6 mins ago
zavg
asked May 13 '13 at 0:09
zavgzavg
121126
121126
Please present theSHOW CREATE TABLE your_table
output. In particular, is there anAUTO_INCREMENT
column? WhichPRIMARY KEY
? What is the order by which rows are inserted?
– Shlomi Noach
May 13 '13 at 5:10
@ShlomiNoach Thank you for reply, I have updated the post
– zavg
May 13 '13 at 12:42
Are your INSERTs enclosed into a transaction?START TRANSACTION;
INSERT
... andCOMMIT;
– Cristian Porta
May 13 '13 at 18:50
@crisrian-porta Not, but I use bulk inserts with 40000 values in one query.
– zavg
May 13 '13 at 19:04
add a comment |
Please present theSHOW CREATE TABLE your_table
output. In particular, is there anAUTO_INCREMENT
column? WhichPRIMARY KEY
? What is the order by which rows are inserted?
– Shlomi Noach
May 13 '13 at 5:10
@ShlomiNoach Thank you for reply, I have updated the post
– zavg
May 13 '13 at 12:42
Are your INSERTs enclosed into a transaction?START TRANSACTION;
INSERT
... andCOMMIT;
– Cristian Porta
May 13 '13 at 18:50
@crisrian-porta Not, but I use bulk inserts with 40000 values in one query.
– zavg
May 13 '13 at 19:04
Please present the
SHOW CREATE TABLE your_table
output. In particular, is there an AUTO_INCREMENT
column? Which PRIMARY KEY
? What is the order by which rows are inserted?– Shlomi Noach
May 13 '13 at 5:10
Please present the
SHOW CREATE TABLE your_table
output. In particular, is there an AUTO_INCREMENT
column? Which PRIMARY KEY
? What is the order by which rows are inserted?– Shlomi Noach
May 13 '13 at 5:10
@ShlomiNoach Thank you for reply, I have updated the post
– zavg
May 13 '13 at 12:42
@ShlomiNoach Thank you for reply, I have updated the post
– zavg
May 13 '13 at 12:42
Are your INSERTs enclosed into a transaction?
START TRANSACTION;
INSERT
... and COMMIT;
– Cristian Porta
May 13 '13 at 18:50
Are your INSERTs enclosed into a transaction?
START TRANSACTION;
INSERT
... and COMMIT;
– Cristian Porta
May 13 '13 at 18:50
@crisrian-porta Not, but I use bulk inserts with 40000 values in one query.
– zavg
May 13 '13 at 19:04
@crisrian-porta Not, but I use bulk inserts with 40000 values in one query.
– zavg
May 13 '13 at 19:04
add a comment |
1 Answer
1
active
oldest
votes
Two hints for you:
The
KEY uid
is redundant, because it is covered by thePRIMARY KEY
40,000
rows at a time might make for too large a transaction. Although these are very small rows (twoINT
s) this may cause the transaction to go to disk, depending on your settings. I usually go with around1,000
rows at a time (I go as low as100
and as high as10,000
).
Please try doing 40 * 1,000 and see if this works better for you than 1 * 40,000
Ok, thank you for advice. I will try your technique and measure the performance increase. Currently I am using such optimizations: 1) I have dropped all indexes while writing to db, 2) I have set AUTOCOMMIT=0; and COMMIT after very large sequence of INSERTs (about 1500000 rows), 3) I am using increased innodb_buffer_pool_size and bulk_insert_buffer_size, 4) also I changed t1.micro instances to m1.small and m2.xlarge to m1.xlarge for better IO performance
– zavg
May 17 '13 at 21:09
number 2) sound like an anti-optimization to me. And how do you mean you've dropped all indexes while writing? Do you mean you'reALTER
ing the table before and after theINSERT
? If so - don't do it.
– Shlomi Noach
May 19 '13 at 17:00
I insert data to the table without indexes and add them after all INSERTs. then I don't write to this table.
– zavg
May 19 '13 at 22:52
add a comment |
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "182"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f42018%2flarge-inserts-performance-optimization%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
Two hints for you:
The
KEY uid
is redundant, because it is covered by thePRIMARY KEY
40,000
rows at a time might make for too large a transaction. Although these are very small rows (twoINT
s) this may cause the transaction to go to disk, depending on your settings. I usually go with around1,000
rows at a time (I go as low as100
and as high as10,000
).
Please try doing 40 * 1,000 and see if this works better for you than 1 * 40,000
Ok, thank you for advice. I will try your technique and measure the performance increase. Currently I am using such optimizations: 1) I have dropped all indexes while writing to db, 2) I have set AUTOCOMMIT=0; and COMMIT after very large sequence of INSERTs (about 1500000 rows), 3) I am using increased innodb_buffer_pool_size and bulk_insert_buffer_size, 4) also I changed t1.micro instances to m1.small and m2.xlarge to m1.xlarge for better IO performance
– zavg
May 17 '13 at 21:09
number 2) sound like an anti-optimization to me. And how do you mean you've dropped all indexes while writing? Do you mean you'reALTER
ing the table before and after theINSERT
? If so - don't do it.
– Shlomi Noach
May 19 '13 at 17:00
I insert data to the table without indexes and add them after all INSERTs. then I don't write to this table.
– zavg
May 19 '13 at 22:52
add a comment |
Two hints for you:
The
KEY uid
is redundant, because it is covered by thePRIMARY KEY
40,000
rows at a time might make for too large a transaction. Although these are very small rows (twoINT
s) this may cause the transaction to go to disk, depending on your settings. I usually go with around1,000
rows at a time (I go as low as100
and as high as10,000
).
Please try doing 40 * 1,000 and see if this works better for you than 1 * 40,000
Ok, thank you for advice. I will try your technique and measure the performance increase. Currently I am using such optimizations: 1) I have dropped all indexes while writing to db, 2) I have set AUTOCOMMIT=0; and COMMIT after very large sequence of INSERTs (about 1500000 rows), 3) I am using increased innodb_buffer_pool_size and bulk_insert_buffer_size, 4) also I changed t1.micro instances to m1.small and m2.xlarge to m1.xlarge for better IO performance
– zavg
May 17 '13 at 21:09
number 2) sound like an anti-optimization to me. And how do you mean you've dropped all indexes while writing? Do you mean you'reALTER
ing the table before and after theINSERT
? If so - don't do it.
– Shlomi Noach
May 19 '13 at 17:00
I insert data to the table without indexes and add them after all INSERTs. then I don't write to this table.
– zavg
May 19 '13 at 22:52
add a comment |
Two hints for you:
The
KEY uid
is redundant, because it is covered by thePRIMARY KEY
40,000
rows at a time might make for too large a transaction. Although these are very small rows (twoINT
s) this may cause the transaction to go to disk, depending on your settings. I usually go with around1,000
rows at a time (I go as low as100
and as high as10,000
).
Please try doing 40 * 1,000 and see if this works better for you than 1 * 40,000
Two hints for you:
The
KEY uid
is redundant, because it is covered by thePRIMARY KEY
40,000
rows at a time might make for too large a transaction. Although these are very small rows (twoINT
s) this may cause the transaction to go to disk, depending on your settings. I usually go with around1,000
rows at a time (I go as low as100
and as high as10,000
).
Please try doing 40 * 1,000 and see if this works better for you than 1 * 40,000
answered May 17 '13 at 19:54
Shlomi NoachShlomi Noach
6,2771721
6,2771721
Ok, thank you for advice. I will try your technique and measure the performance increase. Currently I am using such optimizations: 1) I have dropped all indexes while writing to db, 2) I have set AUTOCOMMIT=0; and COMMIT after very large sequence of INSERTs (about 1500000 rows), 3) I am using increased innodb_buffer_pool_size and bulk_insert_buffer_size, 4) also I changed t1.micro instances to m1.small and m2.xlarge to m1.xlarge for better IO performance
– zavg
May 17 '13 at 21:09
number 2) sound like an anti-optimization to me. And how do you mean you've dropped all indexes while writing? Do you mean you'reALTER
ing the table before and after theINSERT
? If so - don't do it.
– Shlomi Noach
May 19 '13 at 17:00
I insert data to the table without indexes and add them after all INSERTs. then I don't write to this table.
– zavg
May 19 '13 at 22:52
add a comment |
Ok, thank you for advice. I will try your technique and measure the performance increase. Currently I am using such optimizations: 1) I have dropped all indexes while writing to db, 2) I have set AUTOCOMMIT=0; and COMMIT after very large sequence of INSERTs (about 1500000 rows), 3) I am using increased innodb_buffer_pool_size and bulk_insert_buffer_size, 4) also I changed t1.micro instances to m1.small and m2.xlarge to m1.xlarge for better IO performance
– zavg
May 17 '13 at 21:09
number 2) sound like an anti-optimization to me. And how do you mean you've dropped all indexes while writing? Do you mean you'reALTER
ing the table before and after theINSERT
? If so - don't do it.
– Shlomi Noach
May 19 '13 at 17:00
I insert data to the table without indexes and add them after all INSERTs. then I don't write to this table.
– zavg
May 19 '13 at 22:52
Ok, thank you for advice. I will try your technique and measure the performance increase. Currently I am using such optimizations: 1) I have dropped all indexes while writing to db, 2) I have set AUTOCOMMIT=0; and COMMIT after very large sequence of INSERTs (about 1500000 rows), 3) I am using increased innodb_buffer_pool_size and bulk_insert_buffer_size, 4) also I changed t1.micro instances to m1.small and m2.xlarge to m1.xlarge for better IO performance
– zavg
May 17 '13 at 21:09
Ok, thank you for advice. I will try your technique and measure the performance increase. Currently I am using such optimizations: 1) I have dropped all indexes while writing to db, 2) I have set AUTOCOMMIT=0; and COMMIT after very large sequence of INSERTs (about 1500000 rows), 3) I am using increased innodb_buffer_pool_size and bulk_insert_buffer_size, 4) also I changed t1.micro instances to m1.small and m2.xlarge to m1.xlarge for better IO performance
– zavg
May 17 '13 at 21:09
number 2) sound like an anti-optimization to me. And how do you mean you've dropped all indexes while writing? Do you mean you're
ALTER
ing the table before and after the INSERT
? If so - don't do it.– Shlomi Noach
May 19 '13 at 17:00
number 2) sound like an anti-optimization to me. And how do you mean you've dropped all indexes while writing? Do you mean you're
ALTER
ing the table before and after the INSERT
? If so - don't do it.– Shlomi Noach
May 19 '13 at 17:00
I insert data to the table without indexes and add them after all INSERTs. then I don't write to this table.
– zavg
May 19 '13 at 22:52
I insert data to the table without indexes and add them after all INSERTs. then I don't write to this table.
– zavg
May 19 '13 at 22:52
add a comment |
Thanks for contributing an answer to Database Administrators Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f42018%2flarge-inserts-performance-optimization%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Please present the
SHOW CREATE TABLE your_table
output. In particular, is there anAUTO_INCREMENT
column? WhichPRIMARY KEY
? What is the order by which rows are inserted?– Shlomi Noach
May 13 '13 at 5:10
@ShlomiNoach Thank you for reply, I have updated the post
– zavg
May 13 '13 at 12:42
Are your INSERTs enclosed into a transaction?
START TRANSACTION;
INSERT
... andCOMMIT;
– Cristian Porta
May 13 '13 at 18:50
@crisrian-porta Not, but I use bulk inserts with 40000 values in one query.
– zavg
May 13 '13 at 19:04