How To Extract Specific Data From Csv File In Python

How to extract certain csv data based on the header in python

How would I extract specific data from a csv file, based on the header in python?
For example, say the csv file contained this information:
Height,Weight,Age
6.0,78,25
How could I retrieve just the age in python?
asked Apr 16 ’13 at 17:08
I second the csv recommendation, but I think here using csv.DictReader would be simpler:
(Python 2):
>>> import csv
>>> with open("", "rb") as fp:
...     reader = csv.DictReader(fp)
...     data = next(reader)
...
>>> data
{'Age': '25', 'Weight': '78', 'Height': '6.0'}
>>> data["Age"]
'25'
>>> float(data["Age"])
25.0
Here I’ve used next just to get the first row, but you could loop over the rows and/or extract a full column of information if you liked.
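To pull out a whole column instead of a single row, the same DictReader can simply be looped over. The sketch below is written for Python 3 and, as an assumption made only to keep it self-contained, reads from an in-memory string named `sample` rather than a real file; the second data row is made up for illustration.

```python
import csv
import io

# Stand-in for the CSV file from the question, plus one invented
# extra row so the loop has more than one record to visit.
sample = "Height,Weight,Age\n6.0,78,25\n5.5,64,31\n"

reader = csv.DictReader(io.StringIO(sample))

# Each row is a dict keyed by the header names, so the column
# is selected by name rather than by position.
ages = [float(row["Age"]) for row in reader]
print(ages)
```

With a real file, `io.StringIO(sample)` would be replaced by a file object opened with `newline=""`.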
answered Apr 16 ’13 at 17:19
DSM
The process to follow is: read in the first line, find the index (location) on that line of the data you’re looking for, then use that index to pull the data out of the remaining lines.
Python offers a very helpful csv.reader object for doing all the reading, so it’s quite simple.
import csv

filename = 'yourfilenamehere'
column = 'Age'
data = []  # This will contain our data

# Create a csv reader object to iterate through the file
with open(filename, newline='') as f:
    reader = csv.reader(f, delimiter=',', dialect='excel')
    hrow = next(reader)       # Get the top row
    idx = hrow.index(column)  # Find the column of the data you're looking for
    for row in reader:        # Iterate the remaining rows
        data.append(row[idx])

print(data)
Note that the values will come out as strings. You can convert to int by wrapping the row[idx], e.g. data.append(int(row[idx])).
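The header-index approach can also be wrapped in a small reusable function. This is only a sketch: the name `column_values` and its `convert` parameter are hypothetical, and the second data row is invented so there is more than one value to return.

```python
import csv
import io

def column_values(fileobj, column, convert=str):
    """Return every value in the named column, passed through `convert`."""
    reader = csv.reader(fileobj)
    header = next(reader)        # first row holds the column names
    idx = header.index(column)   # raises ValueError if the column is missing
    return [convert(row[idx]) for row in reader]

# In-memory stand-in for the file; a real file object works the same way.
sample = io.StringIO("Height,Weight,Age\n6.0,78,25\n5.9,80,40\n")
ages = column_values(sample, "Age", convert=int)
print(ages)
```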
answered Apr 16 ’13 at 17:15
mfitzp
Extract, Transform, and Save CSV data • fredgibbs.net

Sometimes you’ll have a CSV file that contains lots of useful information, but where some of the information isn’t exactly in the form that you need. Moreover, it is often useful to extract a subset of information from a large and complex file to a separate file that you use for other experimental purposes.
This tutorial explains how to extract place names from a CSV file, clean them up a bit, and save them to a regular text file using python.
I have a CSV file of civil war battles that looks like this:
Battle,Other Names,Location 1
Fort Sumter,,"Charleston County, SC"
Sewell's Point,,"Norfolk City, VA"
Aquia Creek,,"Stafford County, VA"
Philippi,Philippi Races,"Barbour County, WV"
Big Bethel,"Bethel Church, Great Bethel","York County, VA"
Ultimately, I want to map all of these battle sites, but I first need to geolocate all of the locations as listed in the CSV file. Rather than trying to geolocate from the places in the original CSV file (which is quite long), I want to isolate the places in a separate file so I can work with just those. Python makes this a cinch.
The first thing to do is to open my original CSV and read it. A standard way of opening files for reading (hence the “r” below) is like so:
inputfile = open('', 'r')
This is not useful in itself, so let’s loop through all the lines in that file and print them, just to make sure we can do something with them.
for row in inputfile:
    print(row)
This of course prints out our original CSV file. Now, we really just want to extract our place, which we could do in any number of ways. Fortunately, Python ships with a csv module that makes it easy to read and write CSV files and does a lot of the hard work for us. Let’s import it at the beginning of the file and use its csv.reader function to read in the CSV file.
import csv
inputfile = csv.reader(open('', 'r'))
If we inspect the output of this file, we can see that it looks like
['Battle', 'Other Names', 'Location 1']
['Fort Sumter', '', 'Charleston County, SC']
["Sewell's Point", '', 'Norfolk City, VA']
['Aquia Creek', '', 'Stafford County, VA']
['Philippi', 'Philippi Races', 'Barbour County, WV']
['Big Bethel', 'Bethel Church, Great Bethel', 'York County, VA']
This is very convenient because the function that we called has automatically converted each row of the file into a Python list. This makes it easy to access particular elements of the CSV file using the usual Python syntax for accessing an element of a list. Here our location is the 3rd column, but computers always count starting with 0, so row[2] should give us our locations. We can print just the locations as we did the entire lines of the CSV file.
for row in inputfile:
    print(row[2])
This yields:
Location 1
Charleston County, SC
Norfolk City, VA
Stafford County, VA
Barbour County, WV
York County, VA
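It may help to see why the csv module is worth using here instead of splitting each line on commas by hand. The sketch below assumes the fields that contain commas are wrapped in double quotes, which is what csv.reader expects by default, and feeds it a single line from an in-memory string.

```python
import csv
import io

line = 'Big Bethel,"Bethel Church, Great Bethel","York County, VA"'

# Splitting on every comma also cuts through the quoted fields.
naive = line.split(",")

# csv.reader honours the quotes and keeps each field whole.
parsed = next(csv.reader(io.StringIO(line)))

print(len(naive))   # 5 pieces instead of 3 columns
print(parsed[2])    # York County, VA
```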
Progress. But it’s annoying that the original data has inconsistencies, like the space (or not) before the comma. Problems like this are quite common. It’s easy to fix them. Let’s just find every instance of a space and a comma together (' ,') and replace it with a single comma (',').
We can use the replace method that is built into string objects in Python, which is used like:
string.replace(X, Y)
In our case, X is the literal string of a space and comma; Y is the literal string of only a comma:
string.replace(' ,', ',')
Integrating this into our code, we have:
place = row[2].replace(' ,', ',')
print(place)
So far so good. But just printing the locations is not that helpful, though it is an easy way for us to see that things are working so far. Let’s write the locations to a file instead. We only need to add two lines of code: one to open the file for writing, and one to actually write the location.
outputfile = open('', 'w')
outputfile.write(place+'\n')
Notice that we are not opening the output file with the csv module, just with regular Python because we aren’t making a CSV file, just a text file. Also notice that we want to append a newline character “\n” to each line in the file so that each location gets its own line in the file.
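Putting the steps so far together, one possible end-to-end version looks like the sketch below. The function name `extract_places` and both file names are hypothetical, and the demo writes a tiny throwaway CSV into a temporary directory so the snippet can run on its own.

```python
import csv
import os
import tempfile

def extract_places(csv_path, txt_path, column=2):
    """Copy one column of a CSV file into a text file, one value per line."""
    with open(csv_path, newline="") as infile, open(txt_path, "w") as outfile:
        for row in csv.reader(infile):
            # Normalise "County , SC" to "County, SC" before writing.
            place = row[column].replace(" ,", ",")
            outfile.write(place + "\n")

# Demo on a throwaway file; both file names here are made up.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "battles.csv")
dst = os.path.join(workdir, "places.txt")
with open(src, "w", newline="") as f:
    f.write('Battle,Other Names,Location 1\n')
    f.write('Fort Sumter,,"Charleston County , SC"\n')

extract_places(src, dst)
with open(dst) as f:
    written = f.read()
print(written)
```

Using `with` closes both files automatically, which matters for the output file because unwritten data may otherwise stay in a buffer.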
Check your file to make sure it looks good.
We could clean this up a bit more by skipping over the line in the CSV file that contains the headers, like “Location 1”. One easy way to do this is to keep track of which row of the file we are on while we’re looping through it, and skip the first one (which will be row 0).
To implement a counter, we need to define a variable before our loop begins, and increment it by one each time we go through the loop (= each row in the file).
i=0
i+=1
To skip the first row, we just need to test if we are on line 0 or not. Another way of thinking about it is that we only want to write to our file if we are on line 1 or greater (ie not 0).
if (i > 0):
Notice the importance of indentation. We don’t want our i+=1 code to be part of the if block, or it will never run!
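A counter is one way to skip the header; another common option is to consume the first row with next() before the loop starts. The following is a sketch of that alternative, not the tutorial's own code, using an in-memory sample that assumes double-quoted location fields.

```python
import csv
import io

sample = io.StringIO(
    'Battle,Other Names,Location 1\n'
    'Fort Sumter,,"Charleston County, SC"\n'
    'Aquia Creek,,"Stafford County, VA"\n'
)

reader = csv.reader(sample)
header = next(reader)                 # pull the header row off the iterator
places = [row[2] for row in reader]   # the loop now sees data rows only
print(places)
```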
You can use this same logic to help yourself work with more manageable files. Let’s say you have a big CSV file, and you are hoping to geolocate all the places. Rather than test your code on a big file that can take a lot of time and introduce hard to find errors, it’s often easier to just extract a subset of the data and go back to the big file later. Especially for messy historical data, it is good practice to make sure your logic and general process works on well-formed data, then try bigger subsets and deal with problems that the messy data will introduce (and it will!). In other words, get everything working for a small amount of data, then scale up.
In our case, we can limit the size of our output file by not writing to the file if our counter gets past some threshold. So if we only wanted to write the first 2 locations, we can add that constraint to our existing “if” statement that checks to see if we are on line 0 of our CSV file.
if (i > 0 and i < 3):
    outputfile.write(place+'\n')
i+=1
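The same "first few rows only" idea can also be expressed with itertools.islice from the standard library, which stops reading once the requested number of rows has been taken. This is an alternative sketch rather than the approach used above, run against an in-memory sample that assumes double-quoted location fields.

```python
import csv
import io
from itertools import islice

sample = io.StringIO(
    'Battle,Other Names,Location 1\n'
    'Fort Sumter,,"Charleston County, SC"\n'
    'Aquia Creek,,"Stafford County, VA"\n'
    'Philippi,Philippi Races,"Barbour County, WV"\n'
)

reader = csv.reader(sample)
next(reader)  # skip the header row

# islice yields at most 2 rows and leaves the rest of the file unread.
first_two = [row[2] for row in islice(reader, 2)]
print(first_two)
```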

How to Read and Write CSV Files in Python

The CSV format is the most commonly used import and export format for databases and spreadsheets. This tutorial will give a detailed introduction to CSVs and the modules and classes available for reading and writing data to CSV files. It will also cover a working example to show you how to read and write data to a CSV file in Python.
What Is a CSV File?
A CSV (comma separated values) file allows data to be saved in a tabular structure with a .csv extension. CSV files have been used extensively in e-commerce applications because they are considered very easy to process. Some of the areas where they have been used include:
importing and exporting customer data
importing and exporting products
exporting orders
exporting e-commerce analytic reports
Reader and Writer Modules
The CSV module has several functions and classes available for reading and writing CSVs, and they include:
csv.reader function
csv.writer function
csv.DictWriter class
csv.DictReader class
The csv.reader function takes the following parameters:
csvfile: This is usually an object which supports the iterator protocol and usually returns a string each time its __next__() method is called.
dialect='excel': An optional parameter used to define a set of parameters specific to a particular CSV dialect.
fmtparams: An optional parameter that can be used to override existing formatting parameters.
Here is an example of how to use csv.reader.
import csv
with open('', newline='') as File:
    reader = csv.reader(File)
    for row in reader:
        print(row)
csv.writer
The csv.writer function is similar to csv.reader and is used to write data to a CSV. It takes three parameters:
csvfile: This can be any object with a write() method.
dialect='excel': An optional parameter used to define a set of parameters specific to a particular CSV dialect.
fmtparams: An optional parameter that can be used to override existing formatting parameters.
DictReader and DictWriter Classes
The DictReader and DictWriter are classes available in Python for reading and writing to CSV. Although they are similar to the reader and writer functions, these classes use dictionary objects to read and write to csv files.
DictReader
It creates an object which maps the information read into a dictionary whose keys are given by the fieldnames parameter. This parameter is optional; when it is omitted, the values in the first row of the file become the keys of the dictionary.
Example:
import csv
with open('') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        print(row['first_name'], row['last_name'])
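When a file has no header row, the keys can be supplied through fieldnames instead. A minimal sketch, assuming a hypothetical headerless file represented here by an in-memory string:

```python
import csv
import io

# Hypothetical data with no header line.
sample = io.StringIO("Alex,Brian,B\nRachael,Rodriguez,A\n")

# Because fieldnames is given, the first line is treated as data.
reader = csv.DictReader(sample, fieldnames=["first_name", "last_name", "Grade"])
rows = list(reader)
print(rows[0]["first_name"], rows[0]["Grade"])
```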
DictWriter
This class is similar to the DictReader class and does the opposite, which is writing data to a CSV file. The class is defined as csv.DictWriter(csvfile, fieldnames, restval='', extrasaction='raise', dialect='excel', *args, **kwds)
The fieldnames parameter defines the sequence of keys that identify the order in which values in the dictionary are written to the CSV file. Unlike in DictReader, this parameter is not optional and must be defined in order to avoid errors when writing to a CSV.
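The restval and extrasaction parameters in that signature control what happens when a dictionary does not match fieldnames exactly. The sketch below writes to an in-memory buffer; the "N/A" filler and the extra "age" key are made up for illustration.

```python
import csv
import io

out = io.StringIO()
writer = csv.DictWriter(
    out,
    fieldnames=["first_name", "last_name", "Grade"],
    restval="N/A",           # written for keys missing from a row
    extrasaction="ignore",   # silently drop keys not in fieldnames
)
writer.writeheader()
writer.writerow({"first_name": "Alex", "last_name": "Brian"})  # no Grade
writer.writerow({"first_name": "Tom", "last_name": "Smith",
                 "Grade": "C", "age": 20})                     # extra key

lines = out.getvalue().splitlines()
print(lines)
```

With the default extrasaction='raise', the second writerow call would raise a ValueError instead of dropping the unknown key.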
Dialects and Formatting
A dialect is a helper class used to define the parameters for a specific reader or writer instance. Dialects and formatting parameters can be passed when creating a reader or writer; if they are omitted, the 'excel' dialect is used.
There are several attributes which are supported by a dialect:
delimiter: A string used to separate fields. It defaults to ','.
doublequote: Controls how instances of quotechar appearing inside a field should be quoted. Can be True or False.
escapechar: A string used by the writer to escape the delimiter if quoting is set to QUOTE_NONE.
lineterminator: A string used to terminate lines produced by the writer. It defaults to '\r\n'.
quotechar: A string used to quote fields containing special characters. It defaults to '"'.
skipinitialspace: If set to True, any white space immediately following the delimiter is ignored.
strict: If set to True, it raises a csv.Error exception on bad CSV input.
quoting: Controls when quotes should be generated by the writer and recognized by the reader. It defaults to QUOTE_MINIMAL.
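As an illustration of how these attributes are used, the sketch below registers a dialect for semicolon-separated data with padding after each delimiter. The dialect name "semicolons" and the sample data are assumptions chosen for the example.

```python
import csv
import io

# Register a named dialect once, then refer to it by name.
csv.register_dialect("semicolons", delimiter=";", skipinitialspace=True)

sample = io.StringIO("first_name; last_name; Grade\nAlex; Brian; B\n")
rows = list(csv.reader(sample, dialect="semicolons"))
print(rows)
```

The same keyword arguments can also be passed directly to csv.reader or csv.writer without registering a dialect.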
Reading a CSV File
Let’s see how to read a CSV file using the helper modules we have discussed above.
Create your CSV file, ensure that it has the .csv extension, and fill in some data. Here we have our CSV file which contains the names of students and their grades.
Below is the code for reading the data in our CSV using both the csv.reader function and the csv.DictReader class.
Reading a CSV File With csv.reader
import csv
with open('') as File:
    reader = csv.reader(File, delimiter=',', quotechar='"',
                        quoting=csv.QUOTE_MINIMAL)
    for row in reader:
        print(row)
In the code above, we import the CSV module and then open our CSV file as File. We then define the reader object and use the csv.reader function to extract the data into the object. We then iterate over the reader object and retrieve each row of our data.
We show the read data by printing its contents to the console. We have also specified optional parameters such as delimiter, quotechar, and quoting.
Output
['first_name', 'last_name', 'Grade']
['Alex', 'Brian', 'B']
['Rachael', 'Rodriguez', 'A']
['Tom', 'smith', 'C']
Reading a CSV File With DictReader
As we mentioned above, DictReader allows us to read a CSV file by mapping the data to a dictionary instead of lists of strings as in the case of csv.reader. Although fieldnames is an optional parameter, it’s important to always have your columns labelled for readability.
Here’s how to read a CSV using the DictReader class.
import csv
results = []
with open('') as File:
    reader = csv.DictReader(File)
    for row in reader:
        results.append(row)
print(results)
We first import the csv module and initialize an empty list results which we will use to store the data retrieved. We then define the reader object and use the csv.DictReader class to extract the data into the object. We then iterate over the reader object and retrieve each row of our data.
Finally, we append each row to the results list and print the contents to the console.
[{'Grade': 'B', 'first_name': 'Alex', 'last_name': 'Brian'},
{'Grade': 'A', 'first_name': 'Rachael', 'last_name': 'Rodriguez'},
{'Grade': 'C', 'first_name': 'Tom', 'last_name': 'smith'},
{'Grade': 'B', 'first_name': 'Jane', 'last_name': 'Oscar'},
{'Grade': 'A', 'first_name': 'Kennzy', 'last_name': 'Tim'}]
As you can see above, the DictReader class is often more convenient because it gives us each row as a dictionary, so values can be looked up by column name instead of by position.
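Because each row is a dictionary, picking out specific records becomes a one-line filter. A small sketch using the first three student rows from the output above, held in an in-memory string so it runs without a file:

```python
import csv
import io

sample = io.StringIO(
    "first_name,last_name,Grade\n"
    "Alex,Brian,B\n"
    "Rachael,Rodriguez,A\n"
    "Tom,smith,C\n"
)

# Keep only the students whose Grade column equals "A".
top = [row["first_name"] for row in csv.DictReader(sample) if row["Grade"] == "A"]
print(top)
```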
Writing to a CSV File
Let’s now see how to go about writing data into a CSV file using the csv.writer function and the csv.DictWriter class discussed at the beginning of this tutorial.
Writing to a CSV File Using csv.writer
The code below writes the data defined to the file.
import csv
myData = [["first_name", "second_name", "Grade"],
          ['Alex', 'Brian', 'A'],
          ['Tom', 'Smith', 'B']]
myFile = open('', 'w', newline='')
with myFile:
    writer = csv.writer(myFile)
    writer.writerows(myData)
print("Writing complete")
First we import the csv module, and the writer() function will create an object suitable for writing. To write all of the rows in one call, we use the writerows() method.
Here is our CSV with the data we have written to it.
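The quoting parameter described earlier also affects what the writer produces. The sketch below writes one hypothetical row, whose middle field contains a comma, to two in-memory buffers so the default QUOTE_MINIMAL can be compared with QUOTE_ALL.

```python
import csv
import io

# Made-up row; the middle field contains the delimiter.
row = ["Tom", "Smith, Jr.", "B"]

minimal = io.StringIO()
csv.writer(minimal, quoting=csv.QUOTE_MINIMAL).writerow(row)

everything = io.StringIO()
csv.writer(everything, quoting=csv.QUOTE_ALL).writerow(row)

print(minimal.getvalue())     # only the field with a comma is quoted
print(everything.getvalue())  # every field is quoted
```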
Writing to a CSV File Using DictWriter
Let’s write the following data to a CSV.
data = [{'Grade': 'B', 'first_name': 'Alex', 'last_name': 'Brian'},
The code is as shown below.
import csv
with open('', 'w', newline='') as csvfile:
    fieldnames = ['first_name', 'last_name', 'Grade']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerow({'Grade': 'B', 'first_name': 'Alex', 'last_name': 'Brian'})
    writer.writerow({'Grade': 'A', 'first_name': 'Rachael',
                     'last_name': 'Rodriguez'})
    writer.writerow({'Grade': 'B', 'first_name': 'Jane', 'last_name': 'Oscar'})
    writer.writerow({'Grade': 'B', 'first_name': 'Jane', 'last_name': 'Loive'})
We first define the fieldnames, which will represent the headings of each column in the CSV file. The writerow() method will write one row at a time. If you want to write all the data at once, you will use the writerows() method.
Here is how to write to all the rows at once.
writer.writerows([{'Grade': 'B', 'first_name': 'Alex', 'last_name': 'Brian'},
                  {'Grade': 'A', 'first_name': 'Rachael',
                   'last_name': 'Rodriguez'},
                  {'Grade': 'A', 'first_name': 'Kennzy', 'last_name': 'Tim'}])
print("writing complete")
Conclusion
This tutorial has covered most of what is required to be able to successfully read and write to a CSV file using the different functions and classes provided by Python. CSV files have been widely used in software applications because they are easy to read and manage and their small size makes them relatively fast to process and transfer.

Frequently Asked Questions about how to extract specific data from csv file in python

How do I get the specific data of a CSV file in Python?

We first import the csv module and initialize an empty list results which we will use to store the data retrieved. We then define the reader object and use the csv.DictReader class to extract the data into the object. We then iterate over the reader object and retrieve each row of our data. (Dec 5, 2017)

How do I extract a row from a csv file in Python?

“python script to extract rows from csv file” Code Answer (Aug 31, 2020)
# import necessary modules
import csv
with open(r'X:\data.csv', 'rt') as f:
    data = csv.reader(f)
    for row in data:
        print(row)

How do I retrieve data from a CSV file?

Step 1) To read data from CSV files, you must use the reader function to generate a reader object. The reader function is developed to take each row of the file and make a list of all columns. Then, you have to choose the column you want the variable data for. (Oct 7, 2021)
