How can I write a stored procedure that imports data from a CSV file and populates the table?

Why a stored procedure? COPY does the trick – Frank Heikens Jun 7 '10 at 6:43

I have a user interface that uploads the CSV file; to hook this up I need a stored procedure that actually copies the data from the CSV file – vardhan Jun 7 '10 at 6:58

That's what COPY does... – Frank Heikens Jun 7 '10 at 6:59

Bozhidar Batsov already gave you a link to an example, the fine manual could also help: postgresql.org/docs/8.4/interactive/sql-copy.html – Frank Heikens Jun 7 '10 at 7:11

Current manual: postgresql.org/docs/current/static/sql-copy.html – Basil Bourque Aug 9 '14 at 20:37

10 Answers

Take a look at this short article.


Solution paraphrased here:

Create your table:

CREATE TABLE zip_codes 
(ZIP char(5), LATITUDE double precision, LONGITUDE double precision, 
CITY varchar, STATE char(2), COUNTY varchar, ZIP_CLASS varchar);

Copy data from your CSV file to the table:

COPY zip_codes FROM '/path/to/csv/ZIP_CODES.txt' DELIMITER ',' CSV;
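
Since the question asked for a stored procedure, you can also wrap the COPY in one so your application can call it. A minimal sketch, assuming the zip_codes table above (import_zip_codes is a made-up name; being COPY underneath, it still reads the file on the db server):

CREATE OR REPLACE FUNCTION import_zip_codes(path text) RETURNS void AS $$
BEGIN
    -- %L quotes the path as a literal for the dynamic COPY statement
    EXECUTE format('COPY zip_codes FROM %L DELIMITER '','' CSV', path);
END;
$$ LANGUAGE plpgsql;

SELECT import_zip_codes('/path/to/csv/ZIP_CODES.txt');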
actually using \copy does the same trick if you do not have superuser access; it complains on my Fedora 16 when using COPY with a non-root account. – asksw0rder Oct 15 '12 at 17:07

TIP: you can indicate what columns you have in the CSV using zip_codes(col1, col2, col3). The columns must be listed in the same order that they appear in the file. – David Pelaez Jan 2 '13 at 5:16

@asksw0rder does \copy have the same syntax? Because I'm getting a syntax error with \copy – JhovaniC May 29 '13 at 19:59

Should I include the header row? – bernie2436 Oct 27 '13 at 23:09

You can easily include the header row -- just add HEADER in the options: COPY zip_codes FROM '/path/to/csv/ZIP_CODES.txt' DELIMITER ',' CSV HEADER; postgresql.org/docs/9.1/static/sql-copy.html – Barrett Clark Nov 8 '13 at 15:17

If you don't have permission to use COPY (which works on the db server), you can use \copy instead (which works in the db client). Using the same example as Bozhidar Batsov:

Create your table:

CREATE TABLE zip_codes 
(ZIP char(5), LATITUDE double precision, LONGITUDE double precision, 
CITY varchar, STATE char(2), COUNTY varchar, ZIP_CLASS varchar);

Copy data from your CSV file to the table:

\copy zip_codes FROM '/path/to/csv/ZIP_CODES.txt' DELIMITER ',' CSV

You can also specify the columns to read:

\copy zip_codes(ZIP,CITY,STATE) FROM '/path/to/csv/ZIP_CODES.txt' DELIMITER ',' CSV
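
\copy accepts the same options as COPY, so if your file has a header row you can skip it the same way (same example file assumed):

\copy zip_codes FROM '/path/to/csv/ZIP_CODES.txt' DELIMITER ',' CSV HEADER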
\copy voters(ZIP,CITY) FROM '/Users/files/Downloads/WOOD.TXT' DELIMITER ',' CSV HEADER; ERROR: extra data after last expected column CONTEXT: COPY voters, line 2: "OH0012781511,87,26953,HOUSEHOLDER,SHERRY,LEIGH,,11/26/1965,08/19/1988,,211 N GARFIELD ST , ,BLOOMD..." – JZ. Sep 6 '15 at 17:29

@JZ. I had a similar error. It was because I had extra blank columns. Check your csv and if you have blank columns, that could be the reason. – alex bennett Jul 8 '16 at 4:32
This is somewhat misleading: the difference between COPY and \copy is much more than just permissions, and you can't simply add a `\` to make it magically work. See the description (in the context of export) here: stackoverflow.com/a/1517692/157957 – IMSoP Jan 26 at 16:29
@IMSoP: you're right, I added a mention of server and client to clarify – bjelli Jan 27 at 9:03

One quick way of doing this is with the Python pandas library (version 0.15 or above works best). This will handle creating the columns for you, although obviously the choices it makes for data types might not be what you want. If it doesn't quite do what you want, you can always use the 'create table' code it generates as a template.

Here's a simple example:

import pandas as pd
df = pd.read_csv('mypath.csv')
df.columns = [c.lower() for c in df.columns] #postgres doesn't like capitals or spaces

from sqlalchemy import create_engine
engine = create_engine('postgresql://username:password@localhost:5432/dbname')

df.to_sql("my_table_name", engine)

And here's some code that shows you how to set various options:

#Set this so the raw sql output is logged
import logging
logging.basicConfig()
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)

import sqlalchemy  #needed for the dtype mapping below

df.to_sql("my_table_name2", 
          engine, 
          if_exists="append",  #options are 'fail', 'replace', 'append'; default 'fail'
          index=False,  #Do not output the index of the dataframe
          dtype={'col1': sqlalchemy.types.NUMERIC,
                 'col2': sqlalchemy.types.String})  #Datatypes should be sqlalchemy types
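
If your file is large, you can load it in chunks so the whole thing never has to fit in memory at once. A minimal sketch, reusing the engine and table name from above (the chunk size of 10000 is an arbitrary assumption):

import pandas as pd

# read the CSV in slices of 10000 rows and append each slice to the table
for chunk in pd.read_csv('mypath.csv', chunksize=10000):
    chunk.columns = [c.lower() for c in chunk.columns]  # postgres doesn't like capitals or spaces
    chunk.to_sql("my_table_name", engine, if_exists="append", index=False)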
In addition, the if_exists parameter can be set to replace or append to an existing table, e.g. df.to_sql("fhrs", engine, if_exists='replace') – Joel Ostblom Apr 30 '15 at 0:47

username and password: you need to create a login and assign the DB to the user. If using pgAdmin, create a "Login/Group role" using the GUI – Somnath Kadam Mar 24 at 12:52

Pandas is a super slow way of loading to sql (vs csv files). Can be orders of magnitude slower. – user48956 May 4 at 18:46

You could also use pgAdmin, which offers a GUI to do the import. That's shown in this SO thread. The advantage of using pgAdmin is that it also works for remote databases.

Much like the previous solutions though, you would need to have your table in the database already. Everyone has their own approach, but what I usually do is open the CSV in Excel, copy the headers, paste special with transposition on a different worksheet, place the corresponding data type in the next column, then copy and paste all of that into a text editor together with the appropriate SQL table-creation query, like so:

CREATE TABLE my_table (
    /*paste data from Excel here for example ... */
    col_1 bigint,
    col_2 bigint,
    /* ... */
    col_n bigint 
)

As Paul mentioned, import works in pgAdmin:

right click on table -> import

select local file, format and encoding

Here is a German pgAdmin GUI screenshot:

[screenshot: pgAdmin import GUI]

You can do a similar thing with DbVisualizer (I have a license, not sure about the free version):

right click on a table -> Import Table Data...

[screenshot: DbVisualizer import GUI]

DBVisualizer took 50 seconds to import 1400 rows with three fields -- and I had to cast everything back from a String to whatever it was supposed to be. – Noumenon Sep 29 '16 at 10:46

Most other solutions here require that you create the table in advance/manually. This may not be practical in some cases (e.g., if you have a lot of columns in the destination table). So, the approach below may come in handy.

Given the path to your csv file and the number of columns it contains, you can use the following function to load your data into a temp table:

CREATE OR REPLACE FUNCTION csv_to_temp_table (IN path TEXT, IN col_count INTEGER)
RETURNS VOID AS $$
BEGIN
    DROP TABLE IF EXISTS temp;
    CREATE TABLE temp ();

    -- add just enough columns, all typed varchar
    FOR iter IN 1..col_count
    LOOP
        EXECUTE 'ALTER TABLE temp ADD COLUMN col_' || iter || ' VARCHAR;';
    END LOOP;

    -- copy the data from the csv file
    EXECUTE FORMAT('COPY temp FROM %L WITH DELIMITER '',''', path);
END;
$$ LANGUAGE plpgsql;

The columns are named col_1, col_2, ..., and so on.
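
For example, a call like this loads a seven-column file into temp (a sketch; the path and column count are assumptions matching the zip-codes file used earlier):

SELECT csv_to_temp_table('/path/to/csv/ZIP_CODES.txt', 7);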

Furthermore, in most typical scenarios the top row holds the column names. To rename the columns accordingly, you can use the following, a bit more sophisticated version:

create or replace function data.load_csv_file
(
    target_table text,
    csv_path text,
    col_count integer
)

returns void as $$

declare

iter integer; -- dummy integer to iterate columns with
col text; -- variable to keep the column name at each iteration
col_first text; -- first column name, e.g., top left corner on a csv file or spreadsheet

begin
    set schema 'your-schema';

    create table temp_table ();

    -- add just enough number of columns
    for iter in 1..col_count
    loop
        execute format('alter table temp_table add column col_%s text;', iter);
    end loop;

    -- copy the data from csv file
    execute format('copy temp_table from %L with delimiter '','' quote ''"'' csv ', csv_path);

    iter := 1;
    col_first := (select col_1 from temp_table limit 1);

    -- update the column names based on the first row which has the column names
    for col in execute format('select unnest(string_to_array(trim(temp_table::text, ''()''), '','')) from temp_table where col_1 = %L', col_first)
    loop
        execute format('alter table temp_table rename column col_%s to %s', iter, col);
        iter := iter + 1;
    end loop;

    -- delete the columns row
    execute format('delete from temp_table where %s = %L', col_first, col_first);

    -- change the temp table name to the name given as parameter, if not blank
    if length(target_table) > 0 then
        execute format('alter table temp_table rename to %I', target_table);
    end if;

end;

$$ language plpgsql;
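
A sample call, assuming the csv has 7 columns and you've pointed 'your-schema' inside the function at a real schema (the path here is just a placeholder):

select data.load_csv_file('zip_codes', '/path/to/csv/ZIP_CODES.txt', 7);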
Hi Mehmet, thanks for the answer you posted but when I run your code I get the following error message: ERROR: schema "data" does not exist – user2867432 Nov 8 '16 at 5:34

user2867432 you need to change the schema name that you use accordingly (e.g., public) – mehmet Nov 8 '16 at 13:05

COPY table_name FROM 'path/to/data.csv' DELIMITER ',' CSV HEADER;

Use this SQL code:

    copy table_name(attribute1, attribute2, attribute3, ...)
    from 'E:\test.csv' delimiter ',' csv header

The header keyword lets the DBMS know that the csv file has a header row with attribute names.

For more, visit http://www.postgresqltutorial.com/import-csv-file-into-posgresql-table/


IMHO, the most convenient way is to follow "Import CSV data into postgresql, the comfortable way ;-)", using csvsql from csvkit, which is a Python package installable via pip.
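
A typical invocation looks something like this (a sketch; the connection string and file name are assumptions):

csvsql --db postgresql://username:password@localhost:5432/dbname --insert data.csv

With --db and --insert, csvsql infers the column types, creates the table, and inserts the rows in one step.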

Link rot is voracious! The article you linked to no longer works, which makes me uncomfortable :( – chbrown Jul 27 '16 at 20:18

you might want to mention that this is Python. – mountainclimber Aug 9 '16 at 14:46

For me I get a MemoryError if trying to import a large CSV so it looks like it doesn't stream. – DavidC Oct 20 '16 at 12:32

@DavidC Interesting. How big is your file? How much memory do you have? If it doesn't stream as it appears, I suggest chunking the data before insertion – sal Oct 31 '16 at 12:13

@DavidC Or you use the csvsql command without the --insert option and insert the data later via COPY, or you go by RobinL's answer stackoverflow.com/a/29722393/2772305 – sal Oct 31 '16 at 12:20

Maybe this tool will be helpful: http://www.convertcsv.com/csv-to-sql.htm It has a lot of configuration options for converting CSV into a database script.
