PostgreSQL is a solid database that performs very well under heavy load, but it is critical to understand the load on your particular system. Many settings can be tweaked to improve the performance of PostgreSQL, and of the system in general. A subset of those settings is covered here, along with pgbench and how it can be used to assist with tuning. Note that there is no "magic bullet": performance is an ongoing concern, and the system may require repeated tuning as it grows.
Database processes perform complicated operations on lots of data. Complicated operations (comparing, grouping, hashing, joining, sorting, etc.) require lots of processor time. Give your database plenty of processor power: dual processor minimum, chosen from the high end of the price/performance ratio (most GHz for the least USD). Lots of data requires lots of storage, and lots of operations means lots of reading and writing. That means 36 GB (minimum) SCSI on a 180 or 320 bus; use 15K RPM drives if possible. Of course, storing lots of data means operating on lots of data. Operations take place in memory, so there had better be enough, because swapping will degrade performance. At the time of this writing a solid database machine for moderate load was a dual 3.0 GHz with 4 GB RAM and hardware-mirrored 73 GB/SCSI320/15K drives.
Databases tend to store mission-critical data, so one needs to take precautions when storing it. The first line of defence (after quality hardware vendors) is the storage array, or RAID. The reader is expected to know what RAID0, RAID1 and RAID5 are. RAID0 is simply not suitable; do not use it for a database filesystem. RAID1 is favoured by many over RAID5 for the database system; the argument goes something like: parallel writes vs. read+parity-calc+write. The option many consider best is RAID10, which has the reliability of RAID1 with performance like RAID0. RAID10 is highly recommended even though it is the most expensive option.
The choice of performance hardware will be wasted if the software cannot or will not use it. To this end the operating system should have some performance tuning done to it where possible, an entire discussion in itself. PostgreSQL is highly configurable and has many options to improve its performance, obviously consuming resources to do so. Of course hardware, OS and RDBMS tuning will be useless if the database schema and design are not also performance oriented.
All of this relates to Linux but is not specific to any distribution.
PostgreSQL makes heavy use of shared memory (run ipcs to prove it); the more shared memory available, the more data (tables) can be loaded. On a dedicated database server it is not uncommon to give half of the memory to the database. The shared memory maximum can be set in /proc/sys/kernel/shmmax. Run echo $((`cat /proc/sys/kernel/shmmax` / 1048576)) to see how many megabytes are currently permitted for shared memory. Set this to half of the physical memory (or some other reasonable value), e.g. echo $((1024 * 1024 * 1024)) > /proc/sys/kernel/shmmax, that's 1 GB. PostgreSQL will also have to be told to use this, see below.
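The "half of physical memory" rule above can be computed rather than guessed; a minimal sketch, assuming a Linux /proc/meminfo (which reports MemTotal in kB):

```shell
# Compute half of physical memory in bytes as a candidate shmmax value.
# Assumption: /proc/meminfo is present and MemTotal is in kB.
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
shmmax=$(( mem_kb / 2 * 1024 ))
echo "candidate shmmax: $shmmax bytes"
# To apply (as root): echo $shmmax > /proc/sys/kernel/shmmax
```

To make the value survive a reboot, kernel.shmmax can also be set in /etc/sysctl.conf.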
The database filesystem should be mounted with the noatime option, which avoids an access-time write on every read; the entry in /etc/fstab should look something like:
/dev/sda4 /var ext3 noatime 0 0
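A quick sanity check that the options field of such an entry actually contains noatime; the entry string below simply mirrors the example line above, and the field position follows the standard fstab layout:

```shell
# Extract the mount-options field (4th column) of an fstab entry
# and check it for noatime.
entry="/dev/sda4 /var ext3 noatime 0 0"
opts=$(echo "$entry" | awk '{print $4}')
case ",$opts," in
  *,noatime,*) echo "noatime present" ;;
  *)           echo "noatime missing" ;;
esac
```

On a live system, the fourth column of /proc/mounts shows the options actually in effect, and mount -o remount,noatime /var applies the change without a reboot.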
The values and suggestions expressed here are just that: the basic rules that Edoceo uses, YMMV. As always, one should experiment to find the values that work best for your environment.
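pgbench, mentioned earlier, provides a repeatable workload for that experimentation; a minimal session might look like the sketch below. The database name bench is an assumption, the scale and client counts are arbitrary starting points, and the sketch skips gracefully when no server is reachable:

```shell
# Benchmark sketch: initialize the pgbench tables, then run a load.
# -i initializes, -s sets the scale factor, -c client count,
# -t transactions per client.
if command -v pgbench >/dev/null 2>&1 && pg_isready -q 2>/dev/null; then
  createdb bench 2>/dev/null || true
  pgbench -i -s 10 bench       # scale 10: ~1,000,000 rows in the accounts table
  pgbench -c 8 -t 1000 bench   # 8 clients, 1000 transactions each; reports tps
else
  echo "pgbench or server unavailable; the commands above are the recipe"
fi
```

Run the same pgbench invocation before and after each configuration change and compare the reported transactions per second.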
syslog_facility = 'LOCAL2'
syslog_ident = 'postgres'
log_connections = false
log_disconnections = false
log_duration = false
log_statement = 'none'
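With syslog_facility set to LOCAL2 as above, syslog must be told where to route that facility; a line like the following in /etc/syslog.conf would do it (the log file path is an assumption):

```
local2.*    /var/log/postgresql
```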
Performance is also tied to the schema and design of the system and to the data that is stored. Clearly larger data sets will take longer to process, even with high-performance hardware and software. Keep in mind that varchar(50) primary keys with unique indexes will not perform as well as char(10), and both will be slower than integer-based primary keys. Don't index columns that are never searched on; choose the indexed columns carefully.
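The key-type and indexing points can be sketched as DDL; the table and column names below are invented for illustration, and the DDL is fed through psql only when a scratch server is reachable:

```shell
# Schema sketch: integer primary key, short fixed-width alternate key,
# and an index only on the column that is actually searched.
if command -v psql >/dev/null 2>&1 && pg_isready -q 2>/dev/null; then
  createdb bench 2>/dev/null || true
  psql -q bench <<'SQL'
CREATE TABLE customer (
    id   integer PRIMARY KEY,   -- integer keys outperform char(10)/varchar(50)
    code char(10) NOT NULL,     -- shorter keys index faster than varchar(50)
    name varchar(50)            -- never searched, so deliberately not indexed
);
CREATE INDEX customer_code_idx ON customer (code);
SQL
else
  echo "no server reachable; the DDL above illustrates the point"
fi
```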