- Author:
- briangershon
- Posted:
- August 11, 2009
- Language:
- Python
- Version:
- 1.1
- Score:
- 4 (after 4 ratings)
This script converts a CSV file into a JSON file ready to be imported via manage.py loaddata
like any other fixture data.
It can be used manually to do a one-time conversion (for placing into a /fixtures folder), or used in a fabric script that automatically converts CSV to JSON live then runs loaddata
to import as fixture data.
To run script:
csv2json.py input_file_name model_name
e.g. csv2json.py airport.csv app_airport.Airport
Note: input_file_name should be a path relative to where this script is.
Scripts depends on simplejson module. The module can just be placed in a sub-folder to the script to make it easy to import. If you use the same Python binary that you use for your Django site, you could use the Django import instead: from django.utils import simplejson
File Input/Ouptut formats:
Assumes CSV files are saved with LF line endings, and that first line has field values. First column is the model's pk field.
Sample CSV input:
id,ident,name,city,state
1,00C,Animas Air Park,Durango,CO
6,00V,Meadow Lake,Colorado Springs,CO
7,00W,Lower Granite State,Colfax,WA
12,01J,Hilliard Airpark,Hilliard,FL
Output file name is input name + ".json" extension.
Sample JSON output:
[
{
"pk": 1,
"model": "app_airport.Airport",
"fields": {
"name": "Animas Air Park",
"city": "Durango",
"ident": "00C",
"state": "CO",
}
}
]
Debugging Conversion Problems
If JSON import errors out with "ValidationError: This value must be an integer", you probably have a blank in an Integer field within your CSV file, but if can't figure out, try setting a breakpoint in file:
./django/django/db/models/fields/__init__.py
e.g.
688 try:
689 return int(value)
690 except (TypeError, ValueError):
691 import pdb; pdb.set_trace()
692 -> raise exceptions.ValidationError(
693 _("This value must be an integer."))
To figure out what field caused the error, while in the debugger:
(Pdb) u <- to go UP the callstack
(Pdb) field.name
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 | # csv2json.py
#
# Copyright 2009 Brian Gershon -- briang at webcollective.coop
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import getopt
import csv
from os.path import dirname
import simplejson
try:
script, input_file_name, model_name = sys.argv
except ValueError:
print "\nRun via:\n\n%s input_file_name model_name" % sys.argv[0]
print "\ne.g. %s airport.csv app_airport.Airport" % sys.argv[0]
print "\nNote: input_file_name should be a path relative to where this script is."
sys.exit()
in_file = dirname(__file__) + input_file_name
out_file = dirname(__file__) + input_file_name + ".json"
print "Converting %s from CSV to JSON as %s" % (in_file, out_file)
f = open(in_file, 'r' )
fo = open(out_file, 'w')
reader = csv.reader( f )
header_row = []
entries = []
# debugging
# if model_name == 'app_airport.Airport':
# import pdb ; pdb.set_trace( )
for row in reader:
if not header_row:
header_row = row
continue
pk = row[0]
model = model_name
fields = {}
for i in range(len(row)-1):
active_field = row[i+1]
# convert numeric strings into actual numbers by converting to either int or float
if active_field.isdigit():
try:
new_number = int(active_field)
except ValueError:
new_number = float(active_field)
fields[header_row[i+1]] = new_number
else:
fields[header_row[i+1]] = active_field.strip()
row_dict = {}
row_dict["pk"] = int(pk)
row_dict["model"] = model_name
row_dict["fields"] = fields
entries.append(row_dict)
fo.write("%s" % simplejson.dumps(entries, indent=4))
f.close()
fo.close()
|
More like this
- Template tag - list punctuation for a list of items by shapiromatron 11 months, 2 weeks ago
- JSONRequestMiddleware adds a .json() method to your HttpRequests by cdcarter 11 months, 3 weeks ago
- Serializer factory with Django Rest Framework by julio 1 year, 6 months ago
- Image compression before saving the new model / work with JPG, PNG by Schleidens 1 year, 7 months ago
- Help text hyperlinks by sa2812 1 year, 8 months ago
Comments
Hi,
Thank you so much for this snippet. I am just learning Django and Python. I've been trying to use this snippet but i am getting an error message:
File "<stdin>", line 1 csv2json.py sw.csv wkw1.Lawyer SyntaxError: invalid syntax
(the invalid sytax caret points to sw.csv)
I created a csv file sw.csv and I put it in the same directory (sw1) as the snippet csv2json.py.
I started the python shell by typing "python" on the command prompt, and I entered csv2json.py sw.csv wkw1.Lawyer, but I got the error message.
What am I doing wrong? Thanks for the help.
#
Please login first before commenting.