Mar 24, 2022
When you are working on some big project, you will probably need to test speed of your queries, and for that you will need to have a lot of fake records in your database.
You can do this with database seeders, but it can be really slow if you need to seed couple millions of records to database.
In this article I will show you better and faster way for doing this.
Let's first setup our project. It will be a really basic app with one table and one column.
Create fresh Laravel project, at the time of writing this article latest version of Laravel framework is 9.5.1.
composer create-project laravel/laravel --prefer-dist seed-milions
Create a model "Data" with migration, factory and seeder.
php artisan make:model Data -mfs
In migration file add new field "code" which will be string.
public function up()
{
Schema::create('data', function (Blueprint $table) {
$table->id();
$table->string('code');
$table->timestamps();
});
}
Run migrations.
php artisan migrate
As last step, let's create a factory to insert 10 random characters for "code" field.
public function definition()
{
return [
'code' => \Illuminate\Support\Str::random(10)
];
}
That's it. We have now our project ready.
For this case, our goal will be to seed 10 million records to our table "data".
We will first seed 1.000 records to see how much time it takes and based on that we will calculate how much time it would take to seed 10 million records.
Let's define it in the DataSeeder class. We will call factory method on our model and pass to it number of records we want to seed.
public function run()
{
\App\Models\Data::factory(1000)->create();
}
And we will call our seeder class in the DatabaseSeeder class.
public function run()
{
$this->call(DataSeeder::class);
}
Let's run our seeder and see how long it take to seed 1.000 records to database.
php artisan db:seed
Output:
Seeding: Database\Seeders\DataSeeder
Seeded: Database\Seeders\DataSeeder (10,329.51ms)
As we can see, it took ~10s to seed 1.000 records to database. So if we calculate how much time it would take this approach to seed 10 million records it will be:
10,32 sec x 10.000 = 103.200 sec
103.200 sec = 1720 min
1720 min = 28,66 hours
So, to seed 10 million records to database with this approach will take us ~28 hour.
Let's try another approach. We will first create array with data we want to insert to database. After that, we will split that array into chunks and for each of those chunks we will call "insert" method on our model.
Make changes in our DataSeeder class.
public function run()
{
for ($i=0; $i < 1000; $i++) {
$data[] = [
'code' => \Illuminate\Support\Str::random(10)
];
}
$chunks = array_chunk($data, 500);
foreach ($chunks as $chunk) {
\App\Models\Data::insert($chunk);
}
}
Let's run again our seeder to see how much time it takes with this approach to seed 1.000 records to database.
php artisan db:seed
Output:
Seeding: Database\Seeders\DataSeeder
Seeded: Database\Seeders\DataSeeder (70.97ms)
As you can see, now it takes ~71ms to seed 1.000 records to database.
If we calculate how much time it will take to seed 10 million records, we get:
0.071 sec x 10.000 = 710 sec
710 sec = 11,83 min
It will take around ~12 minutes to seed same amount of data.
But this can be even faster if we get bigger chunks of data. Let's try to seed 100.000 records with chunks of 50.000.
Output:
Seeding: Database\Seeders\DataSeeder
Seeded: Database\Seeders\DataSeeder (1,978.38ms)
If we calculate how much time it will take to seed 10 million records, we get:
1.97 sec x 100 = 197 sec
230 sec = 3,28 min
So, if we compare these results, we see that for 10 million records in a standard way of seeding it will take ~28 hours and in this way ~3 minutes.
Of course, I do not suggesting to seed 10 million at once, separate it in a couple of runs (maybe 1 million per run?) and you will probably need to play a little with number of chunks to see what works fastest.
That's it. You can now seed your database with a million of records in a couple of minutes.