Thread: Update cursor with joined tables work around w/ dictionaries

  1. #1
    Mathew Coyle

    Join Date
    Feb 2011
    Posts
    1,369
    Points
    958
    Answers Provided
    150


    3

    This post could easily be called "How I fell in love with dictionaries"

    Drawing the idea from this post http://forums.arcgis.com/threads/525...ctor-one-liner

    I've come up with a solution to a nagging problem that I, and I believe some others, have been having: not being able to reliably use an update cursor on joined tables. I was really happy with my first foray into dictionaries, so I thought I'd share my workaround for anyone looking to optimize tedious processing with joins. My data was ~900k rows of forest stand data in one table, plus a strata reference table of ~50 rows used to calculate volumes. My previous method (a permanent JoinField, processing, then deleting the joined fields) took approximately 3.5 hours. Temporary joins never worked for me in the manner I needed. Using dictionaries instead of joins reduced that time to under 15 minutes.

    The code below works on any table: it builds a list of every field name other than the OID field and the key field you want to reference, then uses that list to populate the dictionary.

    Here is the fairly complete code to create the dictionary:
    Code:
        import arcpy

        print "Starting function"
        # Define and set up variables, tables, key field etc.
        # table_path and join_table_path are assumed to be defined earlier in the script
        calc_table = "calc_view"  # MakeTableView needs an output view name; this one is arbitrary
        arcpy.MakeTableView_management(table_path, calc_table)
        vol_tab = join_table_path
        strata_tab = "in_memory/temp"
        arcpy.MakeTableView_management(vol_tab, strata_tab)
        joinField = "STRATA"
        
        # Create list of value fields, leaving out OID field and key/join field
        flistObj = arcpy.ListFields(strata_tab)
        flist = []
        for f in flistObj:
            if f.type != "OID" and f.name != joinField:
                flist.append(f.name)
    
        # Create empty dict, then map each join-key value to a sub dict of {field name: value} for that row
        strataDict = {}
    
        for r in arcpy.SearchCursor(strata_tab):
            fieldvaldict = {}
            for field in flist:
                fieldvaldict[field] = r.getValue(field)
            strataDict[r.getValue(joinField)] = fieldvaldict
    
        del strata_tab, flistObj
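    As a rough illustration of the structure this builds, here is a minimal sketch in plain Python with no arcpy dependency. The list of dicts stands in for the cursor rows, and the field names, values, and the build_lookup helper are all made up for the example.

```python
# Stand-in for the strata reference table: each dict plays the role of one
# cursor row. Field names and values are made up for illustration.
strata_rows = [
    {"OBJECTID": 1, "STRATA": "A1", "SW_STEMS": 120.0, "PJ_STEMS": 40.0},
    {"OBJECTID": 2, "STRATA": "B2", "SW_STEMS": 75.0, "PJ_STEMS": 210.0},
]


def build_lookup(rows, key_field, skip_fields=()):
    """Map each key_field value to a sub-dict of the row's remaining fields."""
    lookup = {}
    for row in rows:
        lookup[row[key_field]] = {
            name: value
            for name, value in row.items()
            if name != key_field and name not in skip_fields
        }
    return lookup


strata_lookup = build_lookup(strata_rows, "STRATA", skip_fields=("OBJECTID",))

# Two-level access: first the join key, then the value field.
print(strata_lookup["B2"]["PJ_STEMS"])
```

    With only ~50 reference rows, the whole lookup sits comfortably in memory and each access is a pair of dictionary lookups rather than a join.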
    In the update cursor you can then reference dictionary entries explicitly, like this:
    Code:
        rows = arcpy.UpdateCursor(calc_table, "\"%s\" IS NOT NULL" % joinField)
        for row in rows:
            strata = row.getValue(joinField)
            variable = strataDict[strata]["sub_key_field"]
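    One thing to watch with direct indexing: the cursor filter only excludes NULL keys, so a key value that has no row in the reference table would raise a KeyError. A minimal sketch of a more forgiving lookup, in plain Python with made-up data and a hypothetical lookup_value helper:

```python
# Toy lookup in the same shape as strataDict; keys and values are made up.
strata_lookup = {"A1": {"SW_STEMS": 120.0}, "B2": {"SW_STEMS": 75.0}}


def lookup_value(lookup, key, field, default=None):
    """Return lookup[key][field], or default if the key or field is missing."""
    return lookup.get(key, {}).get(field, default)


# Stand-in for the join-field values an update cursor would hand back.
found = []
for strata in ["A1", "ZZ", "B2"]:
    value = lookup_value(strata_lookup, strata, "SW_STEMS")
    if value is None:
        continue  # no reference row for this key, so skip the update
    found.append((strata, value))

print(found)
```

    Whether to skip, log, or fail on an unmatched key depends on the data; skipping is just the choice made for this sketch.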
    What I did was use a reference list to drive the dictionary lookups, which kept things legible and helped me remember what went where. This may not be necessary for everyone, but it helped me conceptually. Without getting into too much detail, here's essentially my update cursor, sans the actual calculations:
    Code:
        species = [
        ("C","Fb","FB_STEMS"),("C","Sw","SW_STEMS"),("C","Pj","PJ_STEMS"), # 0,1,2
        ("C","Pl","PJ_STEMS"),("C","Lt","LT_STEMS"),("C","Sb","SB_STEMS"), # 3,4,5
        ("D","Bw","BW_STEMS"),("D","Aw","AW_STEMS"),("D","Pb","PB_STEMS")  # 6,7,8
        ]
        sp_fields = [("SP1","SP1_PER"),("SP2","SP2_PER"),("SP3","SP3_PER"),
        ("SP4","SP4_PER"),("SP5","SP5_PER")]
        print "Beginning updates"
        rows = arcpy.UpdateCursor(calc_table, "\"%s\" IS NOT NULL" % joinField)
        for row in rows:
            strata = row.getValue(joinField)
            for sp, per in sp_fields:
                sp_type = row.getValue(sp)
                spp_f = float(row.getValue(per))
                if spp_f > 0:
                    for grp, spec, stem in species:
                        stem_f = strataDict[strata][stem]
                        (...)
    Hopefully that didn't get too convoluted. Does anyone else have anything to contribute in terms of optimization?
    Mathew Coyle, EADA10
    GIS Analyst
    Alberta-Pacific Forest Industries Inc.
    ArcGIS 10.2.1 Testing
    ArcGIS 10.1 SP1
    Windows 7 SP1 64-bit

  2. #2
    Kim Ollivier
    Join Date
    Oct 2009
    Posts
    589
    Points
    130
    Answers Provided
    10


    0

    I am also increasingly using dictionaries instead of joins to update tables.
    I tried refactoring my clumsy lines to use the one-line list comprehension, but it turned out to be marginally slower.
    Code:
    222565 dictionary count 0:00:46.594000
    222565 dictionary count 0:00:48.125000
    I note that you do not bother to specify a subset of fields when opening the cursor. If you have a lot of fields it apparently helps a lot to only list the relevant fields for the calculations. Not so easy to generalise I suppose, but it may help with memory management too.
    Has anyone run tests on the 10.1 da module, which has rewritten cursors? Maybe we will not need dictionaries after all.
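    For what it's worth, the da cursors take an explicit field list and yield each row as a tuple in that field order, so the two suggestions above fit together naturally. The sketch below is plain Python: the list of tuples stands in for what arcpy.da.SearchCursor(table, fields) would yield, and the field names and values are made up. It makes no claim about speed.

```python
fields = ["STRATA", "SW_STEMS", "PJ_STEMS"]  # key field first, then value fields

# Stand-in for arcpy.da.SearchCursor(table, fields), which yields one tuple
# per row with values in the same order as `fields`. Values are made up.
rows = [
    ("A1", 120.0, 40.0),
    ("B2", 75.0, 210.0),
]

# Pair the value field names with each row's values to build the sub-dicts.
lookup = {row[0]: dict(zip(fields[1:], row[1:])) for row in rows}

print(lookup["A1"]["SW_STEMS"])
```

    Because only the listed fields are read, the dictionary never holds columns the calculation does not use.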
    Kim Ollivier (EADP101)
    www.ollivier.co.nz kimo@ollivier.co.nz
    "Everywhere is within walking distance
    if you have the time", Steven Wright

  3. #3
    Raphael R
    Join Date
    Apr 2010
    Posts
    109
    Points
    62
    Answers Provided
    8


    0

    Thanks for this!
    I had lots of trouble processing and updating joined tables: it took ages within ArcMap and didn't work at all with update cursors.
    With your suggested dictionary route I've managed to get it working and really sped things up.

  4. #4
    Chris Snyder

    Join Date
    May 2010
    Posts
    1,262
    Points
    414
    Answers Provided
    38


    0

    Related to this post: http://forums.arcgis.com/threads/583...ry-Compression, I am having troubles when the dictionaries get too big!

    Although it's slower, especially for multiple fields, I am finding the ole' "Join and Calc" method is much more memory efficient.
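    One possible way to trim the footprint, offered as an untested idea rather than a fix for the case above: store one tuple per key instead of one sub-dictionary per key, and resolve field names to positions once. In CPython a tuple is generally smaller than a dict holding the same values. The field names, values, and the get_value helper below are made up for the sketch.

```python
import sys

VALUE_FIELDS = ("SW_STEMS", "PJ_STEMS")  # hypothetical field names
FIELD_INDEX = {name: i for i, name in enumerate(VALUE_FIELDS)}

# Stand-in rows as (key, value, value); the numbers are made up.
rows = [("A1", 120.0, 40.0), ("B2", 75.0, 210.0)]

# One tuple per key instead of one sub-dict per key.
compact = {row[0]: row[1:] for row in rows}


def get_value(lookup, key, field):
    """Fetch a field from the tuple stored under key, by position."""
    return lookup[key][FIELD_INDEX[field]]


print(get_value(compact, "B2", "PJ_STEMS"))

# Rough comparison of the two container types for one row (container only,
# not the values they reference); exact sizes vary by Python version.
as_tuple = compact["A1"]
as_dict = dict(zip(VALUE_FIELDS, as_tuple))
print(sys.getsizeof(as_tuple), sys.getsizeof(as_dict))
```

    This does nothing about the 32-bit process limit itself, so for very large reference tables the join-and-calculate route may still be the safer choice.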

  5. #5
    Mathew Coyle

    Quote, originally posted by csny490:
    Related to this post: http://forums.arcgis.com/threads/583...ry-Compression, I am having troubles when the dictionaries get too big!

    Although it's slower, especially for multiple fields, I am finding the ole' "Join and Calc" method is much more memory efficient.
    Yes, I can imagine that when you get into storing multi-million-tuple datasets in memory in a 32-bit process, you're going to have a bad time. When I implemented mine there were only ~50 rows to reference against the main table, which worked out quite well. I have another process with a 1:1 relationship on the 900k row dataset, for which I use a join-and-export process to run calculations. I hope Esri bites the bullet this decade and converts Desktop to a 64-bit application. It's not like datasets or file complexity are shrinking.

    Maybe, as a quick fix, Esri could develop easier-to-use interfaces between Desktop and Server for submitting large geoprocessing jobs to Server, which as of 10.1 uses 64-bit Python.
    http://forums.arcgis.com/threads/546...ow-about-64bit

  6. #6
    Bruce Bacia
    Join Date
    Jun 2012
    Posts
    73
    Points
    45
    Answers Provided
    7


    0

    It's funny I'm reading this post today.....I just switched one of my scripts from a join and select method to a dictionary method and processing time went down from 2 hours to 8 minutes. Long live the dictionary!

  7. #7
    Bruce Bacia

    Here's a neat, pythonesque way of removing unwanted field names. Not sure if it will be faster, but it looks cooler!

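    flistObj = arcpy.ListFields(strata_tab)
    flist = [f.name for f in flistObj]
    # Look up the OID field's actual name (it varies by data source); joinField is the variable, not a literal
    oid_name = arcpy.Describe(strata_tab).OIDFieldName
    for exclude in [oid_name, joinField]:
        flist.remove(exclude)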

  8. #8
    Bruce Bacia


    5
    This post is marked as the answer

    I think this will work, too...even shorter

    flistObj = arcpy.ListFields(strata_tab)
    exclude = [joinField]
    # Drop the OID field by type (its name varies) and the join field by name
    flist = [f.name for f in flistObj if f.type != 'OID' and f.name not in exclude]
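    A quick way to see the difference between excluding the OID field by name and by type, without needing arcpy. The namedtuple below is only a stand-in for the field objects returned by arcpy.ListFields (which expose .name and .type), and the field names are made up.

```python
from collections import namedtuple

# Minimal stand-in for the objects returned by arcpy.ListFields, which
# expose .name and .type; the field names here are made up.
Field = namedtuple("Field", ["name", "type"])
fields = [
    Field("OBJECTID", "OID"),
    Field("STRATA", "String"),
    Field("SW_STEMS", "Double"),
    Field("PJ_STEMS", "Double"),
]
join_field = "STRATA"

# Filtering by name misses the OID field unless it happens to be called "OID".
by_name = [f.name for f in fields if f.name not in ("OID", join_field)]

# Filtering the OID field by type works whatever the field is called.
by_type = [f.name for f in fields if f.type != "OID" and f.name != join_field]

print(by_name)
print(by_type)
```

    The OID field is typically named something like OBJECTID or FID, which is why the type check is the safer of the two.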

  9. #9
    Peter Wilson
    Join Date
    Apr 2010
    Posts
    263
    Points
    16
    Answers Provided
    1


    0

    Hi Mathew

    I came across your thread and hope you can help me use Python dictionaries to accomplish what I'm trying to do. Please note that I'm new to Python, so I would need some help understanding your code, if you don't mind and have the time.

    I have 7.5 million parcels saved as a feature class. Within the feature class I have a field called "SG_Code". I also have two tables called WARMS (i.e. WARMS_DW760 and WARMS_DW764). They each have the fields "SG_Code" and "TIT_DEED_NUM". I then have two additional tables called RED (i.e. Redistribution) and REST (i.e. Restitution). The RED and REST tables have two fields, "SG_CODE" and "TIT_DEED_NUM".

    I need to create a subset feature class of the 7.5 million parcels. First, I need to find matches on "SG_Code" between the parcels feature class and each WARMS table separately (i.e. WARMS_DW760, then WARMS_DW764). I then need to find matches on "SG_Code" between the original 7.5 million parcels and the RED and REST tables. Finally, for the parcels already matched to WARMS_DW760 and WARMS_DW764, I need to compare the "TIT_DEED_NUM" from the WARMS tables with the "TIT_DEED_NUM" in the RED and REST tables to see whether I find additional matches, as not all records in the REST and RED tables have "SG_Codes".

    In short, what I'm trying to accomplish is to identify where I find a match between the parcels and WARMS, then a match between the parcels and RED and REST.

    I've used Add Join so far to accomplish this, but it's running forever. I've attached the model that I've built so far to help explain what I'm trying to accomplish.
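    One reading of these matching rules can be sketched with dictionaries and sets in plain Python. Everything below is an assumption for illustration: the toy lists stand in for the parcel feature class and the WARMS and RED/REST tables, the codes and deed numbers are made up, and the two WARMS tables and the RED and REST tables are each lumped together for brevity.

```python
# Toy stand-ins for the tables; codes and deed numbers are made up.
parcels = ["SG001", "SG002", "SG003", "SG004"]   # SG_Code per parcel
warms = [("SG001", "T100"), ("SG002", "T200")]   # (SG_Code, TIT_DEED_NUM)
red_rest = [("SG003", "T300"), (None, "T200")]   # SG_Code may be missing

# WARMS deed numbers grouped by SG_Code.
warms_deeds_by_sg = {}
for sg, deed in warms:
    warms_deeds_by_sg.setdefault(sg, set()).add(deed)

# Sets give fast membership tests for the RED/REST side.
red_rest_sg = {sg for sg, _ in red_rest if sg is not None}
red_rest_deeds = {deed for _, deed in red_rest if deed is not None}

matches = {}
for sg in parcels:
    in_warms = sg in warms_deeds_by_sg
    # Direct SG_Code match, or fall back to a deed-number match via WARMS.
    shared_deeds = warms_deeds_by_sg.get(sg, set()) & red_rest_deeds
    in_red_rest = sg in red_rest_sg or bool(shared_deeds)
    if in_warms or in_red_rest:
        matches[sg] = (in_warms, in_red_rest)

print(matches)
```

    Each parcel is checked with a handful of set and dictionary lookups, so the 7.5 million parcels would be read once rather than joined repeatedly. Holding the reference tables in memory is only practical if they are reasonably small.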

    Regards
    Attached Files
    Last edited by Playa; 10-05-2012 at 09:28 AM.
    Peter Wilson
    GIS Technologist
    Aurecon South Africa (Pty) Ltd

  10. #10
    Chris Snyder

    Peter - The basic method is shown here: http://forums.arcgis.com/threads/955...ll=1#post30010

  11. #11
    Lisa May
    Join Date
    Aug 2010
    Posts
    1
    Points
    0
    Answers Provided
    0


    0

    I'm creating a dictionary from featureclasses - emassDict =
    {u'1': [2009621.0, 2009622.0, 2009624.0, 2009623.0, 2009625.0, 2009626.0, 2009627.0]}{u'2': [2009633.0]}{u'3': [2009632.0, 2009631.0, 2009630.0, 2009629.0, 2009628.0]}{u'4': [2009617.0, 2009611.0, 2009610.0, 2009614.0, 2009620.0, 2009612.0, 2009616.0, 2009615.0, 2009613.0, 2009618.0, 2009607.0, 2009605.0, 2009619.0, 2009609.0, 2009606.0, 2009608.0]}{u'5': [2009604.0, 2009601.0, 2009600.0, 2009603.0, 2009602.0]}{u'6': [2009100.0]}{u'7': [2009009.0]}{u'8': [2009004.0, 2009005.0, 2009007.0, 2009008.0, 2009001.0, 2009003.0, 2009002.0, 2009006.0]}{u'9': [2009500.0]}

    In this same script I want to update one of the fields "iField" with the values from the dictionary - if the key matches the values in another field "eZoneName". The dictionary is being created, but "iField" is not being populated. I'm not receiving any error messages so it has to be in the logic, but I can't see it. Please help, the total script is here:

    Code:
    eZones = r"C:\temp\NLF.gdb\NLF_EM_2009_Dissolve"
    eZoneName = str("UniqueID")
    iField = "All_EM_List"
    
    eIncidents = r"C:\temp\NLF.gdb\NLF_EM_2009_Identity"
    emNameField = ("E_MASS")
    joinField = "Dissolve_FID"
    arcpy.MakeFeatureLayer_management(eIncidents, "eIncidentsLayer")
        
    with arcpy.da.UpdateCursor(eZones, (eZoneName, iField)) as zoneRows:
        for zone in zoneRows:
            eZoneNameString = zone[0]
            queryString = '"' + eZoneName + '" = ' + "'" + eZoneNameString + "'"
    
            arcpy.MakeFeatureLayer_management(eZones, "CurrenteZonesLayer", queryString)
    
            try:
                arcpy.SelectLayerByLocation_management("eIncidentsLayer", "CONTAINED_BY", "CurrenteZonesLayer")
                
                emassDict = dict()
                for row in arcpy.SearchCursor("eIncidentsLayer"):
                    emName = row.getValue(emNameField)
                    snName = row.getValue(joinField)
                    
                    if snName in emassDict:
                        emassDict[snName].append(emName)
                    else:
                        emassDict[snName] = [emName]
                        
                print emassDict
    
    
                if  eZoneNameString == [snName]:
                    zone[1] = [emName]
                    zoneRows.updateRow(zone)
                    
            except arcpy.ExecuteError:
                print(arcpy.GetMessages(0))
    
            finally:
                arcpy.Delete_management("CurrenteZonesLayer")
    
    arcpy.Delete_management("eIncidentsLayer")
    del zone, zoneRows
    Last edited by lisaesri; 04-01-2014 at 02:38 PM.

  12. #12
    Mathew Coyle

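    This condition is never true, so you are not stepping into the update line. eZoneNameString is a string and [snName] is a one-element list, and in Python a string never compares equal to a list.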

    Code:
    if  eZoneNameString == [snName]:
    I also think you may be confusing lists and dictionaries.
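    A minimal sketch of the lookup-by-key approach, in plain Python without arcpy. The tuples and lists stand in for the search and update cursor rows, the IDs are taken from the dictionary printed in the question, and it assumes the target field is a text field that should hold a comma-separated list.

```python
from collections import defaultdict

# Stand-in for (joinField, emNameField) pairs read once from the incidents layer.
incident_rows = [("1", 2009621.0), ("1", 2009622.0), ("2", 2009633.0)]

# Group the incident names under their zone key.
emass_by_zone = defaultdict(list)
for zone_id, em_name in incident_rows:
    emass_by_zone[zone_id].append(em_name)

# Stand-in for the update cursor rows: [eZoneName value, iField value].
zone_rows = [["1", None], ["2", None], ["3", None]]
for zone in zone_rows:
    names = emass_by_zone.get(zone[0])  # dictionary lookup by key, not ==
    if names:
        # Assumed text field: join the list into a single string value.
        zone[1] = ", ".join(str(int(n)) for n in names)

print(zone_rows)
```

    In the real script the assignment to zone[1] would be followed by zoneRows.updateRow(zone). The key types also have to match, so a string key in the dictionary needs a string value from the zone field.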
