oid vs bytea in Postgresql
Okay, here is the deal, I am working on a project which requires me to store files in my database. But I have recently just switched to postgresql from mysql, so I had to do some research on what methods postgresql offers for file storage. In mysql GLOB would have been my choice, but postgresql offers two methods, BYTEA which is postgresql’s answer to GLOB, or store the files in Large Objects (oid).
I asked around to find out what method would be the fastes, but I could not find the answer I was looking for, so I ended up spending the night testing it for myself, this blog will show the results.
The scripts
byteaTest_save.php
ini_set('max_execution_time', '0');
ini_set('memory_limit', '128M');
$time = microtime();
$time = explode(' ', $time);
$time = $time[1] + $time[0];
$begintime = $time;
/* bytea test
DB:
CREATE TABLE images
(
id serial NOT NULL,
"name" text,
image bytea,
CONSTRAINT images_pkey PRIMARY KEY (id)
)
WITH (OIDS=FALSE);
ALTER TABLE images OWNER TO "storeTest";
*/
function saveBytea($Conn, $Filename)
{
global $Dir;
$FileContent = pg_escape_bytea(file_get_contents($Dir.'/'.$Filename));
$Query = "INSERT INTO images (name, image) VALUES ('".$Filename."', '".$FileContent."');";
pg_query($Conn, $Query);
unset($FileContent);
}
$strConn = "host=localhost dbname=byteaTest user=storeTest password=test sslmode=require";
$Conn = pg_connect($strConn);
global $Dir;
$Dir = "/mnt/hdb/Pictures/BU billeder";
$Handle = opendir($Dir);
while(false !== ($File = readdir($Handle)))
{
if($File !== "." && $File !== "..")
{
saveBytea($Conn, $File);
}
}
closedir($Handle);
pg_close($Conn);
$time = microtime();
$time = explode(" ", $time);
$time = $time[1] + $time[0];
$endtime = $time;
$totaltime = ($endtime - $begintime);
$totaltime = sprintf("%.4f", $totaltime);
$Memory = memory_get_peak_usage();
echo "Execution time: " .$totaltime. " seconds.
";
echo "Peak memory usage: " . $Memory/1024/1024 . " mb.";
byteaTest_load.php
ini_set('max_execution_time', '0');
ini_set('memory_limit', '128M');
$time = microtime();
$time = explode(' ', $time);
$time = $time[1] + $time[0];
$begintime = $time;
/* bytea test
DB:
CREATE TABLE images
(
id serial NOT NULL,
"name" text,
image bytea,
CONSTRAINT images_pkey PRIMARY KEY (id)
)
WITH (OIDS=FALSE);
ALTER TABLE images OWNER TO "storeTest";
*/
function loadBytea($Conn, $ID)
{
global $Dir;
$Query = pg_query($Conn, "SELECT name, image FROM images WHERE id=".$ID.";");
$Row = pg_fetch_row($Query);
$FileContent = pg_unescape_bytea($Row[1]);
file_put_contents($Dir.'/'.$Row[0], $FileContent);
unset($FileContent);
}
$strConn = "host=localhost dbname=byteaTest user=storeTest password=test sslmode=require";
$Conn = pg_connect($strConn);
global $Dir;
$Dir = "/mnt/hdb/oidVsbytea";
$Query = pg_query("SELECT id FROM images;") or die(pg_last_error($Conn));
while($Row = pg_fetch_assoc($Query))
{
loadBytea($Conn, $Row['id']);
}
pg_close($Conn);
$time = microtime();
$time = explode(" ", $time);
$time = $time[1] + $time[0];
$endtime = $time;
$totaltime = ($endtime - $begintime);
$totaltime = sprintf("%.4f", $totaltime);
$Memory = memory_get_peak_usage();
echo "Execution time: " .$totaltime. " seconds.\n";
echo "Peak memory usage: " . $Memory/1024/1024 . " mb.";
oidTest_save.php
ini_set('max_execution_time', '0');
ini_set('memory_limit', '128M');
$time = microtime();
$time = explode(' ', $time);
$time = $time[1] + $time[0];
$begintime = $time;
/* oid test
DB:
CREATE TABLE images
(
id serial NOT NULL,
"name" text,
image oid,
CONSTRAINT images_pkey PRIMARY KEY (id)
)
WITH (OIDS=FALSE);
ALTER TABLE images OWNER TO "storeTest";
*/
function saveOid($Conn, $Filename)
{
global $Dir;
$FileContent = file_get_contents($Dir.'/'.$Filename);
pg_query($Conn, "begin;");
$Oid = pg_lo_create($Conn);
$Query = "INSERT INTO images (name, image) VALUES ('".$Filename."', '".$Oid."');";
pg_query($Conn, $Query);
$OidHandle = pg_lo_open($Conn, $Oid, "w");
pg_lo_write($OidHandle, $FileContent);
pg_lo_close($OidHandle);
pg_query($Conn, "commit;");
unset($FileContent);
}
$strConn = "host=localhost dbname=oidTest user=storeTest password=test sslmode=require";
$Conn = pg_connect($strConn);
global $Dir;
$Dir = "/mnt/hdb/Pictures/BU billeder";
$Handle = opendir($Dir);
while(false !== ($File = readdir($Handle)))
{
if($File !== "." && $File !== "..")
{
saveOid($Conn, $File);
}
}
closedir($Handle);
pg_close($Conn);
$time = microtime();
$time = explode(" ", $time);
$time = $time[1] + $time[0];
$endtime = $time;
$totaltime = ($endtime - $begintime);
$totaltime = sprintf("%.4f", $totaltime);
$Memory = memory_get_peak_usage();
echo "Execution time: " .$totaltime. " seconds.\n";
echo "Peak memory usage: " . $Memory/1024/1024 . " mb.";
oidTest_load.php
ini_set('max_execution_time', '0');
ini_set('memory_limit', '128M');
$time = microtime();
$time = explode(' ', $time);
$time = $time[1] + $time[0];
$begintime = $time;
/* bytea test
DB:
CREATE TABLE images
(
id serial NOT NULL,
"name" text,
image bytea,
CONSTRAINT images_pkey PRIMARY KEY (id)
)
WITH (OIDS=FALSE);
ALTER TABLE images OWNER TO "storeTest";
*/
function loadOid($Conn, $ID)
{
global $Dir;
$Query = pg_query($Conn, "SELECT name, image FROM images WHERE id=".$ID.";");
$Row = pg_fetch_row($Query);
pg_query($Conn, "begin;");
$Handle = pg_lo_open($Conn, $Row[1], "r");
$FileContent = pg_lo_read($Handle, 4194304);
pg_query($Conn, "commit;");
file_put_contents($Dir.'/'.$Row[0], $FileContent);
unset($FileContent);
}
$strConn = "host=localhost dbname=oidTest user=storeTest password=test sslmode=require";
$Conn = pg_connect($strConn);
global $Dir;
$Dir = "/mnt/hdb/oidVsbytea";
$Query = pg_query("SELECT id FROM images;") or die(pg_last_error($Conn));
while($Row = pg_fetch_assoc($Query))
{
loadOid($Conn, $Row['id']);
}
pg_close($Conn);
$time = microtime();
$time = explode(" ", $time);
$time = $time[1] + $time[0];
$endtime = $time;
$totaltime = ($endtime - $begintime);
$totaltime = sprintf("%.4f", $totaltime);
$Memory = memory_get_peak_usage();
echo "Execution time: " .$totaltime. " seconds.\n";
echo "Peak memory usage: " . $Memory/1024/1024 . " mb.";
oidTestClear.php
$strConn = "host=localhost dbname=oidTest user=storeTest password=test sslmode=require";
$Conn = pg_connect($strConn);
$Query = pg_query($Conn, "SELECT image FROM images");
while($R = pg_fetch_assoc($Query))
{
pg_lo_unlink($Conn, $R['image']);
}
NB: These scripts might / or might not be optimized, if you got any changes that might give a better result, please leave a comment.
The test
Every script was run five times, everytime with a clean database and a database restart. The oidTest took abit longer, since I had to unlink every oid that was created and then clear the database (oidTestClear.php).
The scripts was using a dir I have with 1075 pictures, filesizes is from 6kb to 3724kb.
Server hardware:
AMD Athlon XP 2200+ (Clocked to 1.62Ghz)
256Mb Ram
The results
| 1 | 2 | 3 | 4 | 5 | Average | Memory | |
|---|---|---|---|---|---|---|---|
| bytea – save | 402,79s | 379,63s | 380,25s | 378,39s | 377,82s | 383,78s | 41,21MB |
| bytea – load | 228,42s | 228,09s | 227,85s | 228,35s | 228,48s | 228,24s | 18,51MB |
| oid – save | 114,17s | 133,75s | 121,21s | 138,75s | 127,59s | 127,09s | 3,7MB |
| oid – load | 45,53s | 47,45s | 45,58s | 46,97s | 46,6s | 46,43s | 7,7MB |
Conclusion
From my tests, I most say I am suprised, I was expecting that bytea would be a little slower then oid, since it’s data had to be escaped before inserting it to the database, but I was not expecting it to be 3-4 times slower then oid.
Another thing I realized was that bytea creates a huge overhead while processing the data, 41,21MB vs 3,7MB.
From my point of view, oid is at all time the best choice if data storage in a postgresql database is a requirement, bytea is just too slow and memory hungry.
Notes
Bytea might have an advantage over oid, bytea is, from what I know, easier to maintaine and handle, since it’s data is stored inside of the table, instead of in separate place in the database. But then again, if you lose your table with oid’s, your data would still be present, this is not the case with bytea.

(6 votes, average: 4,83 out of 5)
november 28th, 2008 at 11:14
very helpful post, I’m facing the same problem , Thanks!
august 21st, 2010 at 13:22
Does delete from a row from your table images generate a delete cascade to the pg_largeobject table associated to the deleted oid?
Thanks
december 5th, 2010 at 21:57
This is most likely one #of the# best article that ever cross my reference. I do not see why anybody should disagree. It may be too easy #for them# to comprehend…anyway nice work i’m coming again right here for More Great Stuff!!
december 12th, 2010 at 19:57
Helpful blog. Totally agree with you.
marts 13th, 2011 at 22:28
Interesting, really useful post for me
juni 19th, 2011 at 10:44
I am now not certain where you’re getting your information, however great topic. I must spend some time learning much more or understanding more. Thanks for great information I was searching for this info for my mission.