
Style Tips

Excerpts from http://swc.scipy.org/lec/style.html

Reading is Learning

  • People doing creative work in almost every field routinely inspect, dissect, and critique what's come before

    • But most student programmers only ever read short fragments in textbooks
    • Like reading sonnets, then trying to write a novel

     

    • Knowing how to read code is as useful as knowing how to read a proof
      • Have to do it in order to figure out how to make specific changes to specific programs
      • A good way to learn new things

    Seven Plus or Minus

    • The average person's short-term memory can hold 7±2 items [Hock 2004]
      • Seven random digits (as in phone numbers)
      • Seven tasks that still have to be done

       

      • If we try to remember more than that, we:

        • Make mistakes, or
        • Create chunks so that we can remember things at a higher level
          • Common chord progressions in music
          • “Castled kingside” instead of the positions of five separate pieces

        What Does This Have to Do With Programming?

        • When reading and writing code, you have to keep a bunch of facts straight for a short period of time
          • What do this function's parameters mean?
          • What does this loop's index refer to?

           

          • The more odds and ends readers have to keep track of, the more errors they will make

            • Goal of style rules is therefore to reduce the number of things the reader has to juggle mentally

           

          • The greater a difference is, the more likely we are to notice it
            • So every semantic difference ought to be visually distinct…
            • …and every difference in naming or layout ought to mean something

             

            • Most important thing is to be consistent
              • Anything consistent is readable after a while
              • Just watch kids learning to read French, Punjabi, and Korean

              Python Style Guide

              • Taken from PEP 8: Style Guide for Python Code
                • Stick to this unless you have hard data that proves something else is better
              • Basic layout
                • Indent blocks using four spaces
                • Keep lines less than 80 characters long
                • Separate functions with two blank lines
                • Separate logical chunks of long functions with a single blank line
                • Put comments on lines of their own, rather than to the right of code
                Rule | Good | Bad
                No whitespace immediately inside parentheses | max(candidates[sublist]) | max( candidates[ sublist ] )
                …or before the parenthesis or bracket that starts a call, index, or slice | (same as above) | max (candidates [sublist] )
                No whitespace immediately before comma or colon | if limit > 0: print(minimum, limit) | if limit > 0 : print(minimum , limit)
                Use space around arithmetic and in-place operators | x += 3 * 5 | x+=3*5
                No spaces when specifying default parameter values | def integrate(func, start=0.0, interval=1.0) | def integrate(func, start = 0.0, interval = 1.0)
                Never use names that are distinguished only by "l", "1", "0", or "O" | tempo_long and tempo_init | tempo_l and tempo_1
                Short lower-case names for modules (i.e., files) | geology | Geology or geology_package
                Upper case with underscores for constants | TOLERANCE or MAX_AREA | Tolerance or MaxArea
                Camel case for class names | SingleVariableIntegrator | single_variable_integrator
                Lowercase with underscores for function and method names | divide_region | divRegion
                …and member variables | max_so_far | maxSoFar
                Use is and is not when comparing to special values | if current is not None: | if current != None:
                Use isinstance when checking types | if isinstance(current, Rock): | if type(current) == Rock:
                Table 10.1: Basic Python Style Rules
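Several of the rules in Table 10.1 can be seen working together in one short sketch. Everything here is illustrative rather than taken from the original lecture: the Rock class, the heaviest function, the value of TOLERANCE, the extra steps parameter, and the midpoint-rule body of integrate are all assumptions made to give the names something to do.

```python
# Upper case with underscores for constants.
TOLERANCE = 1.0e-6


class Rock:
    """Hypothetical class, used only to illustrate the naming rules."""

    def __init__(self, mass):
        # Lowercase with underscores for member variables.
        self.mass = mass


def integrate(func, start=0.0, interval=1.0, steps=1000):
    """Approximate the integral of func over [start, start + interval]."""
    # Default values in the signature are written without spaces around '='.
    width = interval / steps
    total = 0.0
    for i in range(steps):
        # Midpoint rule, with spaces around arithmetic and in-place operators.
        total += func(start + (i + 0.5) * width) * width
    return total


def heaviest(candidates):
    """Return the heaviest Rock in candidates, or None if there is none."""
    max_so_far = None
    for current in candidates:
        # Use isinstance when checking types.
        if not isinstance(current, Rock):
            continue
        # Use 'is' when comparing to the special value None.
        if max_so_far is None or current.mass > max_so_far.mass:
            max_so_far = current
    return max_so_far
```

Calling heaviest on a list that mixes Rock objects with other values skips the non-rocks, and integrate(lambda x: 2 * x) should come out within TOLERANCE of 1.0.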
                   

                  Naming

                  • Names of files, classes, methods, variables, and other things are the most visible clue to purpose
                    • A variable called temperature shouldn't be used to store the number of pottery shards found at a dig site

                   

                  • Choose names that are both meaningful and readable
                    • current_surface_temperature_of_probe is meaningful, but not readable
                    • cstp is easier to read, but hard to understand…
                    • …and easy to confuse with ctsp

                     

                    • If you must abbreviate, be consistent
                      • curr_ave_temp instead of current_average_temperature is OK…
                      • …but only if no one else is using curnt_av_tmp

                    Scope and Size

                    • The smaller the scope of a name, the more compact it can be
                      • It's OK to use i and j for indices in tightly-nested for loops
                      • But not OK if the loop bodies are several pages long
                        • Of course, they shouldn't be anyway…

                       

                      • The wider the scope of a name, the more descriptive it has to be
                        • Call a class ExperimentalRecord, rather than ER or ExpRec
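A minimal sketch of how name length can track scope. ExperimentalRecord is the class name suggested above; its readings attribute and the count_matching_pairs function are hypothetical, invented only for this illustration.

```python
class ExperimentalRecord:
    """Visible throughout the program, so the name is spelled out in full."""

    def __init__(self, readings):
        self.readings = readings


def count_matching_pairs(grid):
    """Count horizontally adjacent cells in grid that hold equal values."""
    matches = 0
    # i and j exist for only a few lines, so one-letter names are fine here.
    for i in range(len(grid)):
        for j in range(len(grid[i]) - 1):
            if grid[i][j] == grid[i][j + 1]:
                matches += 1
    return matches
```

If the loop body grew to dozens of lines, names such as row and column would be easier to follow than i and j.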
                       

                      Function Length

                      • Every function should do exactly one job
                        • Should be able to describe that job in a single memorable sentence
                        • If that sentence is five phrases joined with “and”, the function should be split up

                         

                        • A good way to judge size and scope is the notion of a program slice

                          • The subset of names in scope at a particular statement that are needed to understand what that statement does
                          • If the slice is much smaller than the method itself, the method is probably bloated

                           

                          • A thousand one-line methods are not an improvement over one thousand-line method
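As a rough illustration of the one-job rule, suppose a function's description would be "parse the readings, and drop the negative ones, and average the rest". The sketch below splits that into three small functions plus one that composes them. All of the names and the input format are assumptions made for the example.

```python
def parse_readings(text):
    """Convert whitespace-separated text into a list of floats."""
    return [float(token) for token in text.split()]


def drop_negative(readings):
    """Return the readings that are zero or greater."""
    return [value for value in readings if value >= 0.0]


def average(readings):
    """Return the mean of readings, or None if there are none."""
    if not readings:
        return None
    return sum(readings) / len(readings)


def average_valid_reading(text):
    """Return the mean of the non-negative readings in text."""
    return average(drop_negative(parse_readings(text)))
```

Each piece can be described in one sentence and tested on its own. Splitting further, say one function per arithmetic operation, would fall into the thousand-one-line-methods trap.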
                           

                          Documentation

                          • Requirements
                            • What needs the software is supposed to meet
                            • “If more than 100 events arrive in a second, the seismograph interface must store them in a queue” tells you to look for a seismograph interface, and a queue that feeds it data

                             

                            • User guide

                              • Actually just another way to specify requirements

                             

                            • Architectural descriptions
                              • Architecture is what you draw on the whiteboard when explaining the program to other people
                              • Shows relationships between major modules, data flow, etc.
                              • Gives other programmers a mental map of how everything fits together

                            More On Documentation

                            • At start of functions (or classes, or methods) to explain what they're for
                              • “This function returns the first pair of non-overlapping subsequences that match the input pattern, or null otherwise”
                              • The programmer's guide to the software

                               

                              • Embedded in code to explain tricky bits
                                • “This is a while loop, instead of a for, because it may delete items from the list as it goes”
                                • In general, if you need to explain the code, you ought to simplify it instead
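The while-loop comment quoted above might sit in code like the first function below. The second function is one possible simplification that needs no such comment. Both function names are hypothetical.

```python
def remove_negatives(values):
    """Delete every negative number from values, in place."""
    i = 0
    # This is a while loop, instead of a for, because it may delete items
    # from the list as it goes: the index advances only when nothing was
    # deleted, so no element is skipped.
    while i < len(values):
        if values[i] < 0:
            del values[i]
        else:
            i += 1


def without_negatives(values):
    """Return a new list holding the non-negative numbers from values."""
    return [value for value in values if value >= 0]
```

The two are not interchangeable: the first changes the caller's list, while the second leaves it alone and returns a copy. Which one is right depends on whether other code holds a reference to the original list.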
                               

                              Embedding Documentation

                              • Embedded documentation is more likely to be up to date than external documentation
                              • Javadoc translates specially-formatted comments into HTML
                                • Java source
                                  /**
                                   * Returns the least common ancestor of two species based on DNA
                                   * comparison, with certainty no less than the specified threshold.
                                   * Note that getConcestor(X, X, t) returns X for any threshold.
                                   *
                                   * @param left        one of the base species for the search
                                   * @param right       the other base species for the search
                                   * @param threshold   the degree of certainty required
                                   * @return            the common ancestor, or null if none is found
                                   * @see               Species
                                   */
                                  public Species getConcestor(Species left, Species right, float threshold) {
                                      ...implementation...
                                  }
                                  
                                • Documentation web page

                                    getConcestor

                                    public Species getConcestor(Species left, Species right, float threshold)

                                    Returns the least common ancestor of two species based on DNA comparison, with certainty no less than the specified threshold. Note that getConcestor(X, X, t) returns X for any threshold.

                                    Parameters:

                                    left - one of the base species for the search

                                    right - the other base species for the search

                                    threshold - the degree of certainty required

                                    Returns:

                                    the common ancestor, or null if none is found

                                    See Also:

                                    Species

                                 


                                  Docstrings

                                  • Python uses documentation strings (or docstrings) rather than specially-formatted comments
                                    • A string literal that is the first statement in a module, class, or function becomes that object's __doc__ attribute
                                    • Unlike a comment, it's there at runtime
                                    '''This module provides functions that search and compare genomes.
                                    All functions assume that their input arguments are in valid CCSN-2
                                    format; unless specifically noted, they do not modify their arguments,
                                    print, or have other side effects.
                                    '''
                                    
                                    __version__ = '$Revision: 497$'
                                    
                                    def get_concestor(left, right, threshold):
                                        '''Find the least common ancestor of two species.
                                    
                                        This function searches for a least common ancestor based on DNA
                                        comparison with certainty no less than the specified threshold.
                                        If one can be found, it is returned; otherwise, the function
                                        returns None.  get_concestor(X, X, t) returns X for any threshold.
                                    
                                        left      : one of the base species for the search
                                        right     : the other base species for the search
                                        threshold : the degree of certainty required
                                        '''
                                    
                                        pass  # implementation would go here
                                    
                                    $ python
                                    >>> import genome
                                    >>> print(genome.__doc__)
                                    This module provides functions that search and compare genomes.
                                    All functions assume that their input arguments are in valid CCSN-2
                                    format; unless specifically noted, they do not modify their arguments,
                                    print, or have other side effects.
                                    
                                    >>> print(genome.get_concestor.__doc__)
                                    Find the least common ancestor of two species.
                                    
                                        This function searches for a least common ancestor based on DNA
                                        comparison with certainty no less than the specified threshold.
                                        If one can be found, it is returned; otherwise, the function
                                        returns None.  get_concestor(X, X, t) returns X for any threshold.
                                    
                                        left      : one of the base species for the search
                                        right     : the other base species for the search
                                        threshold : the degree of certainty required
                                    
                                    
                                  • Tools such as pydoc (in the standard library) and Sphinx (built on Docutils) extract, format, and cross-reference docstrings
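The session above prints the function's raw __doc__ with its source indentation intact. Newer interpreters may strip that indentation when the code is compiled. For consistent text either way, the standard library's inspect.getdoc returns the docstring with the common leading whitespace removed. The sketch below repeats a shortened version of get_concestor so that it runs on its own.

```python
import inspect


def get_concestor(left, right, threshold):
    """Find the least common ancestor of two species.

    left      : one of the base species for the search
    right     : the other base species for the search
    threshold : the degree of certainty required
    """
    pass  # placeholder body, as in the listing above


# inspect.getdoc normalizes indentation, so each line starts at column zero.
doc_lines = inspect.getdoc(get_concestor).splitlines()
summary = doc_lines[0]
print(summary)
print(doc_lines[2])
```

Taking the first line as a one-sentence summary follows the convention the docstring itself uses: a short summary, a blank line, then the details.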


                                      Summary

                                      • Code and documentation decay over time [Eick et al. 2001]
                                      • To prevent this, must:
                                        • Make good style a habit
                                        • Back it up with automated checks
                                      • Remember, we don't actually write programs for the benefit of computers
                                        • It takes a lot of very sophisticated software to translate our programs into a form computers understand
                                      • We write programs for other people
                                        • Our colleagues
                                        • Our future selves
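One small example of an automated check is a function that lists the functions in a module that have no docstring. This is a sketch under stated assumptions, not a replacement for a real style checker: undocumented_functions is a hypothetical name, and the demo module is built in memory purely so the example runs on its own.

```python
import inspect
import types


def undocumented_functions(module):
    """Return the sorted names of functions in module that lack a docstring."""
    missing = []
    for name, obj in inspect.getmembers(module, inspect.isfunction):
        # Skip functions that were merely imported into the module.
        if obj.__module__ != module.__name__:
            continue
        if not (obj.__doc__ or "").strip():
            missing.append(name)
    return sorted(missing)


# Build a throwaway module with one documented and one undocumented function.
demo = types.ModuleType("demo")
exec(
    "def documented():\n"
    "    '''Do nothing, but say so.'''\n"
    "\n"
    "def bare():\n"
    "    pass\n",
    demo.__dict__,
)

print(undocumented_functions(demo))
```

Run as part of a test suite, a check like this fails as soon as someone adds a function without documenting it, which is easier than catching the omission in review.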