
How Do I Sum a Group of Numbers Read From a Text File in Python

Reading and Writing Text Files

Overview

Instruction: 60 min
Exercises: 30 min

Questions

  • How can I read in data that is stored in a file or write data out to a file?

Objectives

  • Be able to open a file and read in the data stored in that file

  • Understand the difference between the file name, the opened file object, and the data read in from the file

  • Be able to write output to a text file with simple formatting

Why do we want to read and write files?

Being able to open and read in files allows us to work with larger data sets, where it wouldn't be possible to type in each and every value and store them one at a time as variables. Writing files allows us to process our data and then save the output to a file so we can look at it later.

Right now, we will practice working with a comma-delimited text file (.csv) that contains several columns of data. However, what you learn in this lesson can be applied to any general text file. In the next lesson, you will learn another way to read and process .csv data.

Paths to files

In order to open a file, we need to tell Python exactly where the file is located, relative to where Python is currently working (the working directory). In Spyder, we can do this by setting our current working directory to the folder where the file is located. Or, when we provide the file name, we can give a complete path to the file.
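
As a rough sketch of both options, the standard-library os module can report the working directory and build a longer path. The folder names below are placeholders borrowed from this lesson's setup and may not exist on your machine.

```python
import os

# Show where Python is currently working (the working directory)
working_directory = os.getcwd()
print(working_directory)

# Option 1: a bare file name is looked up relative to the working directory
filename = 'Plates_output_simple.csv'

# Option 2: spell out the folders leading to the file.
# These folder names are placeholders; substitute your own.
full_path = os.path.join('home', 'Desktop', 'workshops', 'YourName', filename)
print(full_path)

# os.path.exists() reports whether a file is really at that location
print(os.path.exists(full_path))
```

If the last line prints False, Python would not be able to open the file at that location, so check the folder names or the working directory.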

Lesson Setup

We will work with the exercise file Plates_output_simple.csv.

  1. Locate the file Plates_output_simple.csv in the directory home/Desktop/workshops/bash-git-python.
  2. Copy the file to your working directory, home/Desktop/workshops/YourName.
  3. Make sure that your working directory is also set to the folder home/Desktop/workshops/YourName.
  4. As you are working, make sure that you save your file opening script(s) to this directory.

The File Setup

Let's open and examine the structure of the file Plates_output_simple.csv. If you open the file in a text editor, you will see that the file contains several lines of text.

DataFileRaw

However, this is fairly difficult to read. If you open the file in a spreadsheet program such as LibreOffice Calc or Excel, you can see that the file is organized into columns, with each column separated by the commas in the image above (hence the file extension .csv, which stands for comma-separated values).

DataFileColumns

The file contains one header row, followed by eight rows of data. Each row represents a single plate image. If we look at the column headings, we can see that we have collected data for each plate:

  • The name of the image from which the data was collected
  • The plate number (there were 4 plates, with each plate imaged at two different time points)
  • The growth condition (either control or experimental)
  • The observation timepoint (either 24 or 48 hours)
  • Colony count for the plate
  • The average colony size for the plate
  • The percentage of the plate covered by bacterial colonies

We will read in this data file and then work to analyze the data.

Opening and reading files is a three-step process

We will open and read the file in three steps.

  1. We will create a variable to hold the name of the file that we want to open.
  2. We will call a function to open the file.
  3. We will call a function to actually read the data in the file and store it in a variable so that we can process it.

Then, there's one more step to do!

  • When we are done, we should remember to close the file!

You can think of these three steps as being similar to checking out a book from the library. First, you have to go to the catalog or database to find out which book you need (the filename). Then, you have to go and get it off the shelf and open the book up (the open function). Finally, to gain any information from the book, you have to read the words (the read function)!

Here is an example of opening, reading, and closing a file.

    #Create a variable for the file name
    filename = 'Plates_output_simple.csv'
    #This is just a string of text

    #Open the file
    infile = open(filename, 'r')
    # 'r' says we are opening the file to read, infile is the opened file object that we will read from

    #Store the data from the file in a variable
    data = infile.read()

    #Print the data in the file
    print(data)

    #close the file
    infile.close()

Once we have read the data in the file into our variable data, we can treat it like any other variable in our code.
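
Python also offers a with block, which closes the file automatically when the block ends and so guards against forgetting the final step. This is an alternative to the explicit close() call used throughout this lesson, not something the lesson itself relies on. The sketch below first writes a small throwaway file (the name example_with.txt is made up) so that it runs on its own.

```python
# Create a small throwaway file so this sketch is self-contained
# (example_with.txt is a made-up name, not one of the lesson files)
filename = 'example_with.txt'
outfile = open(filename, 'w')
outfile.write('first line\nsecond line\n')
outfile.close()

# The with block opens the file and closes it automatically at the end
with open(filename, 'r') as infile:
    data = infile.read()

print(data)
print(infile.closed)  # True: the file was closed when the block ended
```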

Use consistent names to make your code clearer

It is a good idea to develop some consistent habits about the way you open and read files. Using the same (or similar!) variable names each time will make it easier for you to keep track of which variable is the name of the file, which variable is the opened file object, and which variable contains the read-in data.

In these examples, we will use filename for the text string containing the file name, infile for the open file object from which we can read in data, and data for the variable holding the contents of the file.

Commands for reading in files

There are a variety of commands that allow us to read in data from files.
infile.read() will read in the entire file as a single string of text.
infile.readline() will read in one line at a time (each time you call this command, it reads in the next line).
infile.readlines() will read all of the lines into a list, where each line of the file is an item in the list.
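
To make the difference concrete, here is a small sketch that runs each command on a freshly opened three-line file. The file name example_read.txt and its contents are made up for illustration.

```python
# Write a three-line throwaway file (example_read.txt is a made-up name)
filename = 'example_read.txt'
outfile = open(filename, 'w')
outfile.write('a,1\nb,2\nc,3\n')
outfile.close()

# read() returns the whole file as one string
infile = open(filename, 'r')
whole_text = infile.read()
infile.close()

# readline() returns the next single line, including its newline character
infile = open(filename, 'r')
first_line = infile.readline()
infile.close()

# readlines() returns a list with one string per line
infile = open(filename, 'r')
lines = infile.readlines()
infile.close()

print(repr(whole_text))
print(repr(first_line))
print(lines)
```

Using repr() when printing makes the newline characters visible as \n instead of as actual line breaks.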

Mixing these commands can have some unexpected results.

    #Create a variable for the file name
    filename = 'Plates_output_simple.csv'

    #Open the file
    infile = open(filename, 'r')

    #Print the first two lines of the file
    print(infile.readline())
    print(infile.readline())

    #call infile.read()
    print(infile.read())

    #close the file
    infile.close()

Notice that the infile.read() command started at the third line of the file, where the first two infile.readline() commands left off.

Think of it like this: when the file is opened, a pointer is placed at the top left corner of the file at the beginning of the first line. Any time a read function is called, the cursor or pointer advances from where it already is. The first infile.readline() started at the beginning of the file and advanced to the end of the first line. Now, the pointer is positioned at the beginning of the second line. The second infile.readline() advanced to the end of the second line of the file, and left the pointer positioned at the beginning of the third line. infile.read() began from this position, and advanced through to the end of the file.

In general, if you want to switch between the different kinds of read commands, you should close the file and then open it again to start over.
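
Closing and reopening is the approach this lesson uses. File objects also have a seek() method, and calling seek(0) moves the pointer back to the start of the file without reopening it. A minimal sketch on a throwaway file (example_pointer.txt is a made-up name):

```python
# Write a small throwaway file so this sketch is self-contained
filename = 'example_pointer.txt'
outfile = open(filename, 'w')
outfile.write('line one\nline two\nline three\n')
outfile.close()

infile = open(filename, 'r')
first = infile.readline()     # pointer now sits at the start of line two
rest = infile.read()          # reads from line two to the end of the file
infile.seek(0)                # move the pointer back to the start
everything = infile.read()    # now the whole file is read again
infile.close()

print(repr(first))
print(repr(rest))
print(repr(everything))
```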

Reading all of the lines of a file into a list

infile.readlines() will read all of the lines into a list, where each line of the file is an item in the list. This is extremely useful, because once we have read the file in this way, we can loop through each line of the file and process it. This approach works well on data files where the data is organized into columns similar to a spreadsheet, because it is likely that we will want to handle each line in the same way.

The example below demonstrates this approach:

    #Create a variable for the file name
    filename = "Plates_output_simple.csv"

    #Open the file
    infile = open(filename, 'r')

    lines = infile.readlines()

    for line in lines: #lines is a list with each item representing a line of the file
        if 'control' in line:
            print(line) #print lines for control condition

    infile.close() #close the file when you're done!

Using .split() to separate "columns"

Since our data is in a .csv file, we can use the split command to separate each line of the file into a list. This can be useful if we want to access specific columns of the file.

    #Create a variable for the file name
    filename = "Plates_output_simple.csv"

    #Open the file
    infile = open(filename, 'r')

    lines = infile.readlines()

    for line in lines:
        sline = line.split(',') # separates line into a list of items.  ',' tells it to split the lines at the commas
        print(sline) #each line is now a list

    infile.close() #Always close the file!

Consistent names, again

At first glance, the variable name sline in the example above may not make much sense. In fact, we chose it to be an abbreviation for "split line", which exactly describes the contents of the variable.

You don't have to use this naming convention if you don't want to, but you should work to use consistent variable names across your code for common operations like this. It will make it much easier to open an old script and quickly understand exactly what it is doing.

Converting text to numbers

When we called the readlines() command in the previous code, Python read in the contents of the file as text strings. If we want our code to recognize something in the file as a number, we need to tell it this!

For example, float('5.0') will tell Python to treat the text string '5.0' as the number 5.0. int(sline[4]) will tell our code to treat the text string stored in the fifth position of the list sline as an integer (non-decimal) number.
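
A short sketch of these conversions, using made-up values rather than the lesson's data file. It also adds up a few converted numbers with the built-in sum(), which only works on numbers, not on the original text strings.

```python
# A made-up split line, in the same column order as the lesson file
sline = ['colonies01.tif', '1', 'control', '24', '12', '2.5', '30']

colony_count = int(sline[4])    # text '12' becomes the integer 12
colony_size = float(sline[5])   # text '2.5' becomes the decimal number 2.5

# Strings cannot be added up as numbers until they are converted
count_strings = ['12', '84', '792']
total = sum(int(text) for text in count_strings)

print(colony_count + 1)   # 13
print(colony_size * 2)    # 5.0
print(total)              # 888
```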

For each line in the file, the ColonyCount is stored in the 5th column (index 4 with our 0-based counting).
Modify the code above to print the line only if the ColonyCount is greater than 30.

Solution

    #Create a variable for the file name
    filename = 'Plates_output_simple.csv'

    #Open the file
    infile = open(filename, 'r')

    lines = infile.readlines()

    for line in lines[1:]: #skip the first line, which is the header
        sline = line.split(',') # separates line into a list of items.  ',' tells it to split the lines at the commas
        colonyCount = int(sline[4]) #store the colony count for the line as an integer
        if colonyCount > 30:
            print(sline)

    #close the file
    infile.close()

Writing data out to a file

Often, we will want to write data to a new file. This is especially useful if we have done a lot of computations or data processing and we want to be able to save it and come back to it later.

Writing a file is the same multi-step process

Just like reading a file, we will open and write the file in multiple steps.

  1. Create a variable to hold the name of the file that we want to open. Often, this will be a new file that doesn't yet exist.
  2. Call a function to open the file. This time, we will specify that we are opening the file to write into it!
  3. Write the data into the file. This requires some careful attention to formatting.
  4. When we are done, we should remember to close the file!

The code below gives an example of writing to a file:

    filename = "output.txt"

    #w tells python we are opening the file to write into it
    outfile = open(filename, 'w')

    outfile.write("This is the first line of the file")
    outfile.write("This is the second line of the file")

    outfile.close() #Close the file when we're done!

Where did my file end up?

Any time you open a new file and write to it, the file will be saved in your current working directory, unless you specified a different path in the variable filename.

Newline characters

When you examine the file you just wrote, you will see that all of the text is on the same line! This is because we must tell Python when to start a new line by using the special string character '\n'. This newline character will tell Python exactly where to start each new line.

The example below demonstrates how to use newline characters:

    filename = 'output_newlines.txt'

    #w tells python we are opening the file to write into it
    outfile = open(filename, 'w')

    outfile.write("This is the first line of the file\n")
    outfile.write("This is the second line of the file\n")

    outfile.close() #Close the file when we're done!

Go open the file you just wrote and check that the lines are spaced correctly.

Dealing with newline characters when you read a file

You may have noticed in the last file reading example that the printed output included newline characters at the end of each line of the file:

['colonies02.tif', '2', 'exp', '24', '84', '3.2', '22\n']
['colonies03.tif', '3', 'exp', '24', '792', '3', '78\n']
['colonies06.tif', '2', 'exp', '48', '85', '5.2', '46\n']

We can get rid of these newlines by using the .strip() function, which removes whitespace, including newline characters, from both ends of a string:

    #Create a variable for the file name
    filename = 'Plates_output_simple.csv'

    #Open the file
    infile = open(filename, 'r')

    lines = infile.readlines()

    for line in lines[1:]: #skip the first line, which is the header
        sline = line.strip() #get rid of trailing newline characters at the end of the line
        sline = sline.split(',') # separates line into a list of items.  ',' tells it to split the lines at the commas
        colonyCount = int(sline[4]) #store the colony count for the line as an integer
        if colonyCount > 30:
            print(sline)

    #close the file
    infile.close()

Writing numbers to files

Just like Python automatically reads files in as strings, the write() function expects to only write strings. If we want to write numbers to a file, we will need to "cast" them as strings using the function str().

The code below shows an example of this:

    numbers = range(0, 10)

    filename = "output_numbers.txt"

    #w tells python we are opening the file to write into it
    outfile = open(filename, 'w')

    for number in numbers:
        outfile.write(str(number))

    outfile.close() #Close the file when we're done!

Writing new lines and numbers

Go open and examine the file you just wrote. You will see that all of the numbers are written on the same line.

Change the code to write each number on its own line.

Solution

    numbers = range(0, 10)

    #Create the range of numbers
    filename = "output_numbers.txt" #provide the file name

    #open the file in 'write' mode
    outfile = open(filename, 'w')

    for number in numbers:
        outfile.write(str(number) + '\n')

    outfile.close() #Close the file when we're done!

The file you just wrote should be saved in your Working Directory. Open the file and check that the output is correctly formatted with one number on each line.

Opening files in different 'modes'

When we have opened files to read or write data, we have used the function parameter 'r' or 'w' to specify which "mode" to open the file in.
'r' indicates we are opening the file to read data from it.
'w' indicates we are opening the file to write data into it.

Be very, very careful when opening an existing file in 'w' mode.
'w' will overwrite any data that is already in the file! The overwritten data will be lost!

If you want to add on to what is already in the file (instead of erasing and overwriting it), you can open the file in append mode by using the 'a' parameter instead.
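
A minimal sketch of the difference between the two modes, using a throwaway file whose name (example_append.txt) is made up for illustration:

```python
filename = 'example_append.txt'   # made-up name for this sketch

# 'w' creates the file (or wipes it if it already exists)
outfile = open(filename, 'w')
outfile.write('first line\n')
outfile.close()

# 'a' keeps the existing contents and adds to the end
outfile = open(filename, 'a')
outfile.write('second line\n')
outfile.close()

# Read the file back to check that both lines are there
infile = open(filename, 'r')
data = infile.read()
infile.close()
print(data)
```

If the second open() used 'w' instead of 'a', the file would end up containing only the second line.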

Pulling it all together

Read in the data from the file Plates_output_simple.csv that we have been working with. Write a new csv-formatted file that contains only the rows for control plates.
You will need to do the following steps:

  1. Open the file.
  2. Use .readlines() to create a list of lines in the file. Then close the file!
  3. Open a file to write your output into.
  4. Write the header line of the output file.
  5. Use a for loop to loop through each line in the list of lines from the input file.
  6. For each line, check if the growth condition was experimental or control.
  7. For the control lines, write the line of information to the output file.
  8. Close the output file when you're done!

Solution

Here's one way to do it:

    #Create a variable for the file name
    filename = 'Plates_output_simple.csv'

    #Open the file
    infile = open(filename, 'r')

    lines = infile.readlines()
    #We will process the lines of the file later

    #close the input file
    infile.close()

    #Create the file we will write to
    filename = 'ControlPlatesData.txt'
    outfile = open(filename, 'w')

    outfile.write(lines[0]) #This will write the header line of the file

    for line in lines[1:]: #skip the first line, which is the header
        sline = line.split(',') # separates line into a list of items.  ',' tells it to split the lines at the commas
        condition = sline[2] #store the condition for the line as a string
        if condition == "control":
            outfile.write(line) #The variable line is already formatted correctly!

    outfile.close() #Close the file when we're done!

Challenge Problem

Open and read in the data from Plates_output_simple.csv. Write a new csv-formatted file that contains only the rows for the control condition and includes only the columns for Time, colonyCount, avgColonySize, and percentColonyArea. Hint: you can use the .join() function to join a list of items into a string.

    names = ['Erin', 'Mark', 'Tessa']
    nameString = ', '.join(names) #the ', ' tells Python to join the list with each item separated by a comma + space
    print(nameString)

Erin, Mark, Tessa

Solution

    #Create a variable for the input file name
    filename = 'Plates_output_simple.csv'

    #Open the file
    infile = open(filename, 'r')

    lines = infile.readlines()
    #We will process the lines of the file later

    #close the file
    infile.close()

    # Create the file we will write to
    filename = 'ControlPlatesData_Reduced.txt'
    outfile = open(filename, 'w')

    #Write the header line
    headerList = lines[0].split(',')[3:] #This will return the list of column headers from 'time' on
    headerString = ','.join(headerList) #join the items in the list with commas
    outfile.write(headerString) #There is already a newline at the end, so no need to add one

    #Write the remaining lines
    for line in lines[1:]: #skip the first line, which is the header
        sline = line.split(',') # separates line into a list of items.  ',' tells it to split the lines at the commas
        condition = sline[2] #store the condition for the line as a string
        if condition == "control":
            dataList = sline[3:] #keep only the columns from 'time' on
            dataString = ','.join(dataList) #join the items in the list with commas
            outfile.write(dataString) #the last item still carries the line's newline

    outfile.close() #Close the file when we're done!

Key Points

  • Opening and reading a file is a multistep process: defining the filename, opening the file, and reading the data

  • Data stored in files can be read in using a variety of commands

  • Writing data to a file requires attention to data types and formatting that isn't necessary with a print() statement


Source: https://eldoyle.github.io/PythonIntro/08-ReadingandWritingTextFiles/
