
Thread: End-of-line (EOL) Problem

  1. #1
    Nicholas Jacob
    Join Date
    Apr 2010
    Posts
    44
    Points
    5
    Answers Provided
    0


    0

    Default End-of-line (EOL) Problem

    Came across a weird EOL Error with Arcpy's CalculateField_management() function today and I was wondering if anyone out there could point me towards a work around?

    I previewed one of my tables in Catalog and at first glance the offending cells seemed fine. However, when I copy/pasted one over to a text file a lot more text became visible. My current theory is that the Enter key is being hit when my customers edit cells in MS Excel (my typical source data) and invisible newline characters are being carried into my geodatabase tables. Tricky thing is I can't see them in Excel or Arc, so I'm not sure how to strip or replace the backslashes via Python.

    Any advice would be much appreciated,
    - Nick

  2. #2
    mark denil
    Join Date
    Apr 2010
    Posts
    517
    Points
    284
    Answers Provided
    36


    0

    Default Re: End-of-line (EOL) Problem

    Try dumping the table to a text file, if you think there is stuff going on in there you cannot see.

    There is an 're' module in python that handles regular expressions. That is the tool set you want for weeding out pesky newlines ('\n').

    It may be easier to weed them in the table or in the text dump (which could then be re-imported to a table).

  3. #3
    Nicholas Jacob
    Join Date
    Apr 2010
    Posts
    44
    Points
    5
    Answers Provided
    0


    0

    Default Re: End-of-line (EOL) Problem

    Quote Originally Posted by mdenil View Post
    Try dumping the table to a text file, if you think there is stuff going on in there you cannot see.

    There is an 're' module in python that handles regular expressions. That is the tool set you want for weeding out pesky newlines ('\n').

    It may be easier to weed them in the table or in the text dump (which could then be re-imported to a table).
    Thanks for the quick response! I copied a cell over to Notepad++ and it turns out each line is preceded by a carriage return and line feed (CR LF). Ordinarily, I would just clean up the data by hand, but in this case I need a script to filter this sort of thing. Have you ever seen this done via field calculator by chance? I've been experimenting with regular expressions, but no luck so far. I keep getting "EOL while scanning string literal" related errors. One of my many attempts below:


    Code:
    #ESRI Codeblock
    codeblock="""def trimNewline(val):
        import re
        newVal = re.sub('(?m)[\r\n]',"",val)
        return newVal"""
    
    #Expression parameter
    expression = "trimNewline(str(!FIELD!))"
    
    # CalculateField_management(in_table,field,expression,{expression_type},{code_block})
    arcpy.CalculateField_management(mytable,"FIELDNAME",expression,"PYTHON",codeblock)
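One likely culprit in the attempt above can be sketched outside ArcGIS in plain Python 3. Because the code block is an ordinary (non-raw) triple-quoted string, Python converts `\r` and `\n` into real control characters before the text is handed to any tool, so the quoted pattern ends up split across lines. The names `trim_newline` and `compiles` are illustrative only, and whether a raw string is enough for Calculate Field itself is not something this sketch can confirm.

```python
# Ordinary triple-quoted string: \r and \n become real CR/LF characters,
# so the quoted pattern inside the code block is broken across lines.
broken = """def trim_newline(val):
    import re
    return re.sub('[\r\n]', '', val)"""

# Raw triple-quoted string: the backslash escapes are kept as typed.
fixed = r"""def trim_newline(val):
    import re
    return re.sub('[\r\n]', '', val)"""


def compiles(source):
    """Return True if the code block text is valid Python source."""
    try:
        compile(source, "<codeblock>", "exec")
        return True
    except SyntaxError:
        return False


namespace = {}
exec(fixed, namespace)  # define trim_newline from the raw code block
cleaned = namespace["trim_newline"]("line one\r\nline two")
print(compiles(broken), compiles(fixed), repr(cleaned))
```

Only the raw version survives as valid source here; the ordinary version fails before the regular expression ever runs.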

  4. #4
    Curtis Price

    Join Date
    Oct 2009
    Posts
    1,798
    Points
    874
    Answers Provided
    127


    1

    Default Re: End-of-line (EOL) Problem

    There's a nice python string method that trims whitespace around strings:

    Code:
    >>> x
    '\r\nhere is my real text\n'
    >>> x.strip()
    'here is my real text'
    So one approach you could use would be:

    Code:
    arcpy.CalculateField_management(mytable,"FIELDNAME","!FIELDNAME!.strip()","PYTHON")
    One could also use VBScript:
    Code:
    arcpy.CalculateField_management(mytable,"FIELDNAME","Trim([FIELDNAME])")
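One caveat worth sketching in plain Python 3: `strip()` only trims the two ends of a string, so a line break in the middle of a value is left in place. The sample value below is made up for illustration.

```python
value = "\r\n568 N Courier Ct\r\nand I can type more here\n"

# strip() removes only the leading and trailing whitespace.
stripped = value.strip()

# split() with no arguments breaks on any run of whitespace (including
# CR and LF); joining with a space flattens the value to one line.
flattened = " ".join(value.split())

print(repr(stripped))
print(repr(flattened))
```

The split/join idiom also collapses runs of ordinary spaces, which may or may not be wanted for a given field.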

  5. #5
    Nicholas Jacob
    Join Date
    Apr 2010
    Posts
    44
    Points
    5
    Answers Provided
    0


    1

    Default Re: End-of-line (EOL) Problem

    -------
    Update
    -------

    Just a quick update to anyone that may come across this post down the road (and big thank you to mdenil and curtvprice). The short-term conclusion I've come to is that it's not possible to feed hidden newline characters ('\n') into the Field Calculator. Also, because carriage returns and newlines are special, some functions can't access them like I had anticipated, such as strip().

    Workaround was to convert the problem strings to hexadecimal and swap out values with an Update Cursor. See example below:

    Code:
    rows = arcpy.UpdateCursor(fc)
    for row in rows:
        if len(row.NAME) >= 255:
            hexString = str(row.NAME).encode("hex")
            if "0a" in hexString: # "0a" is hex equivalent of '\n'
                hexString = hexString.replace("0a","")
                # Write back only the rows that were actually changed
                row.NAME = hexString.decode("hex")
                rows.updateRow(row)
    Definitely not ideal, but seems reliable so far. Another interesting thing is that my problem carriage returns magically disappeared somewhere earlier in my script, when converting an Excel worksheet to an in-memory feature class. This might suggest another arcpy function stripped them out, but not sure which one.
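A caution on the hex approach, sketched in plain Python 3 (where `bytes.hex()` and `bytes.fromhex()` stand in for the Python 2 `"hex"` codec): searching the hex text for "0a" can match across a byte boundary and corrupt characters that are not line feeds. Both function names are hypothetical, and the Latin-1 encoding is an assumption made only to keep one character per byte.

```python
def strip_lf_hex_naive(text):
    """Mirror of the hex search-and-replace idea (Python 3 spelling)."""
    hex_string = text.encode("latin-1").hex()
    return bytes.fromhex(hex_string.replace("0a", "")).decode("latin-1")


def strip_lf_bytewise(text):
    """Drop line feeds by comparing whole two-digit byte values."""
    hex_string = text.encode("latin-1").hex()
    pairs = [hex_string[i:i + 2] for i in range(0, len(hex_string), 2)]
    kept = "".join(pair for pair in pairs if pair != "0a")
    return bytes.fromhex(kept).decode("latin-1")


sample = "P\xa1"  # bytes 50 a1: no line feed, but the hex text contains "0a"
print(repr(strip_lf_hex_naive(sample)))  # the two bytes collapse into one
print(repr(strip_lf_bytewise(sample)))   # left unchanged
```

Inside a cursor loop, where escape codes behave normally, a direct `text.replace("\n", "")` avoids the hex round trip altogether.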

  6. #6
    Warren Roe
    Join Date
    Mar 2011
    Posts
    11
    Points
    1
    Answers Provided
    0


    1

    Default Re: End-of-line (EOL) Problem

    This is a useful thread. I've run into related problems, including how to find and correct them (VB vs Python). When performing Python field calculations on fields containing \n, \x, etc., I get the same errors as Nick described, but I don't get them when using VB (because \n et al. aren't special in VB?). So a few points:

    1) I didn't even realize string fields could support carriage returns! Apparently this is new as of 9.2? For example, my address string field can contain:

    568 N Courier Ct
    and I can type more here
    and here if I use "ctrl+enter"
    or if the incoming table
    had \n or \r for line breaks

    ... but when I visually inspect the cell, all I see is "568 N Courier Ct" since it's on the first line. It is not visually apparent without starting an edit session, highlighting text, and dragging and/or arrowing. This is a real hazard when importing from text files or Excel; this could really drive somebody nuts when geocoding imported addresses.

    2) Besides \n, I've also run into problems with \x in the fields. I guess any of the reserved characters would be problematic. ArcGIS SQL is a little helpful for spotting these, in that

    SELECT * FROM street_features WHERE
    "street" LIKE '%
    %'

    finds all instances of carriage returns in the "street" field ... but I have no idea how to write a similar query to find instances of \x, \b, etc.

    3) Nick's hexadecimal conversion seems to work for \n but, again, how can we apply it to other special characters? Or is it just easier to use VB instead of Python for field calculations?
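For locating the other special characters from a script rather than from SQL, one option is to test each character's Unicode category. This is a minimal Python 3 sketch; `find_control_chars` is a hypothetical helper and the sample address is invented, so in practice the values would come from a search cursor.

```python
import unicodedata


def find_control_chars(value):
    """Return (index, repr) pairs for every control character in value."""
    return [
        (index, repr(ch))
        for index, ch in enumerate(value)
        if unicodedata.category(ch) == "Cc"
    ]


address = "568 N Courier Ct\r\nand more\x08 here"
print(find_control_chars(address))
```

Category "Cc" covers carriage returns, line feeds, tabs, backspaces and the other control characters, so no per-character query is needed.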
    Last edited by wroe; 07-27-2012 at 03:20 PM.

  7. #7
    Nicholas Jacob
    Join Date
    Apr 2010
    Posts
    44
    Points
    5
    Answers Provided
    0


    0

    Default Re: End-of-line (EOL) Problem

    Cool idea using SQL to search for carriage returns. Later on, I came to the conclusion that I shouldn't bother stripping out the carriage returns, or any of the other random problems with the data for that matter. At some point I needed to shift the responsibility of data maintenance over to the end user of whatever model or script I was writing. So, in the end I think I included some exception handling to specifically watch for them.

    As for the never ending programming language of preference debate, does it really matter? Who honestly writes more than a handful of scripts or models per year anyways?

  8. #8
    Curtis Price

    Join Date
    Oct 2009
    Posts
    1,798
    Points
    874
    Answers Provided
    127


    2
    This post is marked as the answer

    Default Re: End-of-line (EOL) Problem

    Quote Originally Posted by wroe View Post
    Or is it just easier to use VB instead of Python for field calculations?
    I believe (and I think many would agree) that Python has far superior string manipulation capabilities.

    Here are some pretty good Python methods for stripping non printables:

    http://stackoverflow.com/questions/9...ring-in-python
    http://stackoverflow.com/questions/1...ring-in-python
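One common way to strip non-printables in Python 3, offered as a sketch rather than as the method from the linked answers, is a translation table that deletes the ASCII control range. `CONTROL_CHARS` and `remove_control_chars` are illustrative names.

```python
# Translation table that deletes ASCII control characters (0-31 and 127).
CONTROL_CHARS = dict.fromkeys(list(range(32)) + [127])


def remove_control_chars(value):
    """Return value with ASCII control characters removed."""
    return value.translate(CONTROL_CHARS)


print(repr(remove_control_chars("568 N Courier Ct\r\n\tSuite 2\x00")))
```

Deleting the characters joins the text on either side, so mapping them to a space instead may suit address-like fields better.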

  9. #9
    Jan van Linge
    Join Date
    Dec 2011
    Posts
    1
    Points
    0
    Answers Provided
    0


    0

    Default Re: End-of-line (EOL) Problem

    Quote Originally Posted by wroe View Post
    3) Nick's hexadecimal conversion seems to work for \n but, again, how can we apply it to other special characters? Or is it just easier to use VB instead of Python for field calculations?
    I ran into the same problem when using python to add hyperlinks. As hyperlinks contain "\" characters it sometimes happened that a "\n" was in the hyperlink. I solved it by passing in the hyperlink as a raw string instead of a normal string:

    hyperlink = "c:\somehyperlink\name_of_file"
    arcpy.CalculateField_management(TableToEdit, "HYPERLINK_FIELD", r"r'" + hyperlink + r"'", "PYTHON")

  10. #10
    Curtis Price

    Join Date
    Oct 2009
    Posts
    1,798
    Points
    874
    Answers Provided
    127


    1

    Default Re: End-of-line (EOL) Problem

    Quote Originally Posted by janvanlinge View Post
    I ran into the same problem when using python to add hyperlinks. As hyperlinks contain "\" characters it sometimes happened that a "\n" was in the hyperlink. I solved it by passing in the hyperlink as a raw string instead of a normal string:

    Code:
    hyperlink = "c:\somehyperlink\name_of_file"
    arcpy.CalculateField_management(TableToEdit, "HYPERLINK_FIELD", r"r'" + hyperlink + r"'", "PYTHON")
    The above code embeds a newline (\n) in the string:

    Code:
    >>> print "c:\somehyperlink\name_of_file"
    c:\somehyperlink
    ame_of_file
    I think this may work better:

    Code:
    >>> print 'r"{0}"'.format(r"c:\somehyperlink\name_of_file")
    r"c:\somehyperlink\name_of_file"

    Code:
    hyperlink = r"c:\somehyperlink\name_of_file"
    arcpy.CalculateField_management(TableToEdit, "HYPERLINK_FIELD", 'r"{0}"'.format(hyperlink), "PYTHON")
    I thought I'd add one more thing to this post: how to specify newlines in the Calculate Field tool in ModelBuilder. The interactive tool dialog parser converts "\n" to real newlines in the code box, which doesn't work, so the workaround is to use chr(10). I've used this as a quick and dirty way to have ModelBuilder print a message:
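The difference between the plain and raw path strings can be checked in plain Python 3 without ArcGIS. The sketch also shows `repr()` as one way to build a quoted literal from a path; whether Calculate Field accepts the doubled backslashes that `repr()` produces is an untested assumption.

```python
plain = "c:\\somehyperlink\name_of_file"  # \n here is a line feed
raw = r"c:\somehyperlink\name_of_file"    # backslash followed by "n"

print(chr(10) in plain)  # True: the path was silently split
print(chr(10) in raw)    # False

# repr() produces a quoted, escaped literal that evaluates back to the path.
literal = repr(raw)
print(literal)
print(eval(literal) == raw)
```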

    Expression: msg()

    Code:
    def msg():
      # text = "\n\nThis is\na message to you.\n"  # does not work
      text = "{0}{0}This is{0}a message to you.{0}".format(chr(10))
      return text
    Last edited by curtvprice; 11-07-2012 at 08:03 AM.

  11. #11
    Marc Nakleh
    Join Date
    Sep 2011
    Posts
    60
    Points
    2
    Answers Provided
    0


    0

    Default Re: End-of-line (EOL) Problem

    I do not believe that this problem has been sufficiently addressed.

    Using the Field Calculator box, with text containing a CR-LF (characters 13 and 10), I have tested all of the following:
    • Using a replace ('\r\n', '')
    • Using strip
    • Using filter
    • Using a comprehension that decompiles and checks individual characters
    Even using a Codeblock, these tactics did not seem to work.

    However, if I go from the console and set up something like:
    Code:
    rows = arcpy.UpdateCursor(fc)
    for row in rows:
        if '\r\n' in row.TextString:
             row.setValue('TextString', row.TextString.replace('\r\n', ' '))
             rows.updateRow(row)
        del row
    del rows
    It works exactly as one would expect. But I would love to know more about why this doesn't seem to work from the Field Calculator window.
    Marc Nakleh
    Cartographer, Digital Product Design
    TrakMaps


    www.TrakMaps.com

  12. #12
    I B
    Join Date
    Nov 2012
    Posts
    99
    Points
    4294967295
    Answers Provided
    3


    0

    Default Re: End-of-line (EOL) Problem

    Yeah, still not working.

  13. #13
    Curtis Price

    Join Date
    Oct 2009
    Posts
    1,798
    Points
    874
    Answers Provided
    127


    1

    Default Re: End-of-line (EOL) Problem

    Quote Originally Posted by mnakleh View Post
    However, if I go from the console and set up something like:
    Code:
    rows = arcpy.UpdateCursor(fc)
    for row in rows:
        if '\r\n' in row.TextString:
             row.setValue('TextString', row.TextString.replace('\r\n', ' '))
             rows.updateRow(row)
    del row, rows
    It works exactly as one would expect. But I would love to know more about why this doesn't seem to work from the Field Calculator window.
    The problem is that you cannot use Python escape codes like "\r" in the Field Calculator code block or the Calculate Value code block. I'm assuming this has something to do with the parsing of python arguments into string representation in the arcpy/gp messaging framework.

    If you need to access escape characters, use the chr() function instead.

    This will probably work fine:

    Code:
    rows = arcpy.UpdateCursor(fc)
    for row in rows:
        newline = chr(13) + chr(10)
        if newline in row.TextString:
             row.setValue('TextString', row.TextString.replace(newline, ' '))
             rows.updateRow(row)
    del row, rows
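Since field values may hold CR LF pairs, lone carriage returns, or lone line feeds, a script-side cleanup can handle all three in one pass. This Python 3 sketch uses plain dictionaries as a stand-in for cursor rows; `flatten_eol` and `EOL_PATTERN` are hypothetical names, and escape codes are fine here because the code runs in a script rather than in a Calculate Field code block.

```python
import re

# Matches CR LF pairs first, then any lone CR or LF.
EOL_PATTERN = re.compile(r"\r\n|\r|\n")


def flatten_eol(text, replacement=" "):
    """Replace every end-of-line sequence in text with replacement."""
    return EOL_PATTERN.sub(replacement, text)


# Stand-in for cursor rows: plain dicts instead of arcpy row objects.
rows = [
    {"TextString": "This is a\r\ntest"},
    {"TextString": "lone\rcarriage return"},
    {"TextString": "no breaks"},
]
for row in rows:
    row["TextString"] = flatten_eol(row["TextString"])

print([row["TextString"] for row in rows])
```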
    Last edited by curtvprice; 04-10-2013 at 11:33 AM. Reason: del row, rows should both be outside loop

  14. #14
    Ian Broad
    Join Date
    Aug 2012
    Posts
    30
    Points
    6
    Answers Provided
    1


    0

    Default Re: End-of-line (EOL) Problem

    Quote Originally Posted by curtvprice View Post
    The problem is that you cannot use Python escape codes like "\r" in the Field Calculator code block or the Calculate Value code block. I'm assuming this has something to do with the parsing of python arguments into string representation in the arcpy/gp messaging framework.

    If you need to access escape characters, use the chr() function instead.

    This will probably work fine:

    Code:
    rows = arcpy.UpdateCursor(fc)
    for row in rows:
        newline = chr(13) + chr(10)
        if newline in row.TextString:
             row.setValue('TextString', row.TextString.replace(newline, ' '))
             rows.updateRow(row)
        del row
    del rows
    Thanks Curtis, I'll give that a shot.

  15. #15
    Marc Nakleh
    Join Date
    Sep 2011
    Posts
    60
    Points
    2
    Answers Provided
    0


    0

    Default Re: End-of-line (EOL) Problem

    Hello Curtis,

    I think I was too vague: the cursor example I provided works fine. However, NOTHING I have tried in the Field Calculator box worked. Even if I replace references to '\r\n' with chr(13) + chr(10), it still doesn't work.

    One of the responses on GIS StackExchange describing the exact same problem recommends the same thing as you, and those who tried it seemed to have just as little luck as I did.

    I set up a quick test just to make sure that I was isolating the issue:
    1. In a new shapefile, I add 2 text fields (TEXTFIELD, NEWTEXTF)
    2. I create a single feature
    3. I type the following text in Notepad: "This is a[ENTER]test" (where [ENTER] represents pressing the Enter button)
    4. I copy-paste this text (which is on two lines) into the feature's TEXTFIELD value
    5. I then run the following in FieldCalculator: NEWTEXTF = !TEXTFIELD!.upper()

    This generates the following error message:
    Executing: CalculateField test NEWTEXTF !TEXTFIELD!.upper() PYTHON_9.3 #
    Start Time: Thu Jul 18 12:35:16 2013
    ERROR 000539: Error running expression: "This is a
    test".upper() <type 'exceptions.SyntaxError'>: EOL while scanning string literal (<string>, line 1)
    Failed to execute (CalculateField).
    Failed at Thu Jul 18 12:35:16 2013 (Elapsed Time: 0,00 seconds)


    Any attempts to replace the newlines, using either escape sequences or chr() calls, result in the same error.
    It looks as if the CalculateField is passing along the newlines unescaped, which breaks the interpreter.

    So, a couple of questions come to mind:
    1. Do you get the same behaviour as me for the basic case of !TEXTFIELD!.upper()?
    2. If yes, does this mean that ALL CalculateField calls that use the Python interpreter need to have their input sanitized to remove newlines? Or that we should just switch to Cursors in all cases to avoid any errors or difficulties?
    3. Could you paste the actual working code you used to get the example working properly?

    If you'd prefer, we can correspond directly by e-mail too, so I can send you the samples I have.

    Thanks so much!
    Marc Nakleh
    Cartographer, Digital Product Design
    TrakMaps


    www.TrakMaps.com

  16. #16
    Kris Sedmera
    Join Date
    May 2010
    Posts
    2
    Points
    0
    Answers Provided
    0


    0

    Angry Re: End-of-line (EOL) Problem

    Dang it!

    Why does the field calculator for Python reject standard strings with \t , \n, etc. characters? The only one it seems to accept is \r.

    e.g. f.write(!textfield1!+', '+!textfield2!+'\r') works
    but f.write(!textfield1!+',\t'+!textfield2!+'\t') doesn't work

    This bug severely limits the possibilities for writing output from a table to a file or an email.

    What is the point of castrating Python's string operators?

  17. #17
    Curtis Price

    Join Date
    Oct 2009
    Posts
    1,798
    Points
    874
    Answers Provided
    127


    0

    Default Re: End-of-line (EOL) Problem

    Quote Originally Posted by mnakleh View Post
    Could you paste the actual working code you used to get the example working properly?
    Code:
    import os
    import arcpy
    
    tbl = arcpy.CreateScratchName("","","table","in_memory")
    arcpy.CreateTable_management("in_memory",os.path.basename(tbl))
    arcpy.AddField_management(tbl,"TESTFIELD","TEXT")
    Rows = arcpy.InsertCursor(tbl)
    Row = Rows.newRow()
    Rows.insertRow(Row)
    del Row, Rows
    arcpy.CalculateField_management(tbl,"TESTFIELD","chr(10) + chr(13)","PYTHON_9.3")
    print arcpy.GetMessages()
    Rows = arcpy.SearchCursor(tbl)
    Row = Rows.next()
    print "Field value: ",repr(Row.TESTFIELD)
    del Row, Rows
    Results:

    Code:
    Executing: CalculateField in_memory\xx0 TESTFIELD "chr(10) + chr(13)" PYTHON_9.3 #
    Start Time: Mon Aug 05 10:49:33 2013
    Succeeded at Mon Aug 05 10:49:33 2013 (Elapsed Time: 0.00 seconds)
    Field value:  u'\n\r'
    Also this worked fine for me:

    [Image attachment: cf_test.PNG]

    I think the problem you're running into is using the value of the field !TESTFIELD! if the field contains newlines - the tool will substitute in the value of the field into the expression - the geoprocessing messaging and Python interpreter can't deal with this.

    I think the method you found is a good approach, that is, using Python with cursors (using the Calculate Value tool if in ModelBuilder) instead of using Calculate Field. I don't see another way around this, which is a design issue with the way Calculate Field accesses field values, and its connection with the geoprocessing message environment in how things are passed to Python.

    Seems to me this is a good enhancement request for Calculate Field, i.e. have non printables converted to escape codes as part of the !FIELDNAME! -> value substitution process.
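The substitution problem and the suggested enhancement can be imitated in plain Python 3. This is only a model of the behavior implied by the error message earlier in the thread, not the tool's actual implementation; both `substitute_*` functions are hypothetical.

```python
def substitute_naive(expression, field, value):
    """Paste the raw value between double quotes (assumed tool behavior)."""
    return expression.replace("!{0}!".format(field), '"{0}"'.format(value))


def substitute_escaped(expression, field, value):
    """Paste an escaped literal built with repr() instead."""
    return expression.replace("!{0}!".format(field), repr(value))


value = "This is a\ntest"
expression = "!TEXTFIELD!.upper()"

try:
    eval(substitute_naive(expression, "TEXTFIELD", value))
    naive_ok = True
except SyntaxError:
    naive_ok = False  # the line break lands inside the string literal

escaped_result = eval(substitute_escaped(expression, "TEXTFIELD", value))
print(naive_ok, repr(escaped_result))
```

In this model the naive paste fails no matter what the rest of the expression does, which matches the observation that `replace()`, `strip()` and `chr()` calls made no difference.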
    Last edited by curtvprice; 08-05-2013 at 09:12 AM.

  18. #18
    Curtis Price

    Join Date
    Oct 2009
    Posts
    1,798
    Points
    874
    Answers Provided
    127


    1

    Default Re: End-of-line (EOL) Problem

    Quote Originally Posted by ksed View Post
    Why does the field calculator for Python reject standard strings with \t , \n, etc. characters?

    What is the point of castrating Python's string operators?
    This is a limitation of the ArcGIS geoprocessing tool setup, which has to pass all tool parameters in string representation. The expression and code block are Calculate Field tool parameters. (String reps are easily used to pass tool parameters across the web, XML, etc.)

    The fix is to use the chr() function instead of escape codes in your expression or Calculate Field code block.
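As a rough illustration in plain Python 3, a code block written with `chr()` calls contains no backslashes at all, so nothing in it depends on how escape codes are parsed along the way. The function name `strip_eol` is made up, and as noted earlier in the thread this does not help when the field value itself contains line breaks that get substituted into the expression.

```python
# Code block text written with chr() calls instead of escape codes.
code_block = """def strip_eol(val):
    return val.replace(chr(13), '').replace(chr(10), '')"""

# chr(92) is the backslash character: none appear in the code block.
has_backslash = chr(92) in code_block

namespace = {}
exec(code_block, namespace)  # define strip_eol from the code block text
result = namespace["strip_eol"]("first\r\nsecond\tthird")
print(has_backslash, repr(result))
```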
