Yukon Place Names

The Canadian geographic placenames board now publishes all of its data for free in a variety of formats and services on the Canadian Geographical Names Service (CGNS) (yay!). I decided to try to build a script which could be run once a year, or on an as-needed basis, to update a Yukon Gazetteer. The automation part was a failure, but the data part is okay. What follows are my notes to myself, so I don’t know how much you’ll be able to get out of it.  — Matt Wilkie

There are 3,937 placenames in the database. Some are withdrawn or rescinded though. You’ll need to consult the user guide and the specifications for what that means (GNSS_Users_Guide.pdf, http://cgns-dev.nrcan.gc.ca/cgns_web/standards_spec.html).

Yukon_Placenames.shp is the result of my efforts. It’s pretty much ready to use. Some work remains to be done to substitute the special characters for labeling. (See the update from 15-nov-2007 at the end of the page.)

Original_yk-names.txt is the original data as downloaded and before cleaning.

yk-names_cleaned.csv is the cleaned and now true CSV file.

yk-geonames_cvs.shp is the cleaned file converted into a point shapefile.

yk-geonames_gml.shp is the output from the Web Feature Server. The CSV and the GML files have the same records but different, and useful, attributes so ideally they should be merged together. That’s a whole ‘nother project though.

Core_fields.txt has all the nitty gritty details on the attribute schemas and values.

The download archive is yk_placenames_distrib.zip and about 2mb.

If you don’t care what trials and tribulations created this dataset stop reading now. 🙂

Update

15 November 2007

We can use the Gentium, Charis & Doulos fonts for accurate rendering of the native placenames, especially with this helpful character picker as a selection tool: http://people.w3.org/rishida/scripts/pickers/latin/ Soooo much easier than any other method I’ve seen to find the characters one needs! Characters are shown in order of visual similarity. No more constant jumping back and forth from one section to another trying to find that special X! (Use the special ones at the bottom too, and copy/paste the results!)

Next task: script to convert geonames {32} codes to the appropriate stacked diacriticals: ǭ̈
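Here’s a rough sketch of what that script might look like: scan a name for bracketed codes like {32} and swap in the matching Unicode character(s). The code-to-character table below is a placeholder (I don’t have the real mapping handy), so it only shows the mechanism, not the actual CGNS assignments.

import re

# Placeholder mapping; the real entries need to come from the CGNS character-code table.
CODE_TO_CHAR = {
    "32": "\u01ED\u0308",   # hypothetical: o-with-ogonek-and-macron plus combining diaeresis
}

def decode_name(name):
    # Replace {NN} codes with their Unicode equivalents, leaving unknown codes untouched.
    return re.sub(r"\{(\d+)\}", lambda m: CODE_TO_CHAR.get(m.group(1), m.group(0)), name)

print(decode_name("Ts{32} Creek"))   # made-up example name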

 


Ugly Details

To download the entire Yukon in CSV format, use this URL: http://gnss.nrcan.gc.ca/gnss-srt/api?bbox=-142.0,59.0:-123.0,72&regionCode=60&output=csv (be nice to their server; we don’t need to be getting it more than once or twice a year. Also be patient: it takes about three minutes for the entire file to be sent). Saved as original_yk-names.csv

Huh. The data is there but not in CSV format: there are pipe symbols (|) as field delimiters and HTML line breaks (<br>) as record delimiters. A fairly simple job for regular expression search and replace if you have a decent text editor. Fixed version: yukon-placenames.csv. I submitted a bug report in July and one of the developers responded. I gave some more detail and haven’t heard back. When I checked again this morning the CSV output was still broken. Oh, there are data problems too, things like 105O typed as 105-zero (a zero where the letter O should be).
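For when this gets scripted properly, here’s a rough Python sketch of the same cleanup, assuming the raw API output really is pipes between fields and <br> between records as described above (filenames taken from the list at the top of the page):

import csv
import re

# Read the raw API output and split on <br> (record delimiter).
with open("original_yk-names.csv", encoding="iso-8859-1") as f:
    records = [r.strip() for r in re.split(r"<br\s*/?>", f.read()) if r.strip()]

# Split each record on | (field delimiter) and write a true CSV.
with open("yk-names_cleaned.csv", "w", newline="", encoding="iso-8859-1") as out:
    writer = csv.writer(out)
    for rec in records:
        writer.writerow(field.strip() for field in rec.split("|"))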

Went looking for a script to easily convert lat/long decimal degrees to UTM. Haven’t found an ArcGIS one yet, but this Python library is very easy to use: http://pygps.org/. Now I need to figure out how to tell it to get the UTM zone by itself. There’s this one too: http://starship.python.net/crew/jhauser/Gproj.html

What about GDAL/OGR? Asked the fwtools mailing list. Answer from Frank Warmerdam:

The OGR Projections Tutorial might be helpful for you, though it mostly
addresses stuff from the C++ point of view: http://www.gdal.org/ogr/osr_tutorial.html

The Python script http://www.gdal.org/srctree/pymod/samples/tolatlong.py
should show a bit of how to use projections stuff in Python. In your
case you want to go from lat/long to utm. There is nothing pre-baked
in OGR to identify the optimal UTM zone for a given point, but it is
relatively easy to find the nearest central meridian since they are all
in six degree increments.

Sorry I don’t have something a bit more specific!
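Following Frank’s hint, here’s a small sketch of that approach (pyproj is my substitution, not something he suggested): derive the UTM zone from the longitude, since zones are 6 degrees wide counting from 180°W, then project from NAD83 geographic (EPSG:4269) to the matching NAD83 / UTM zone (EPSG:269xx).

from pyproj import Transformer

def utm_zone(lon):
    # UTM zones are 6 degrees wide, numbered 1..60 starting at 180W.
    return int((lon + 180) // 6) + 1

def to_utm(lon, lat):
    zone = utm_zone(lon)
    # EPSG:26907, 26908, ... are NAD83 / UTM zone 7N, 8N, ...
    t = Transformer.from_crs("EPSG:4269", "EPSG:%d" % (26900 + zone), always_xy=True)
    easting, northing = t.transform(lon, lat)
    return zone, easting, northing

print(to_utm(-135.05, 60.72))   # roughly Whitehorse; should land in zone 8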

Bah humbug. If ArcCatalog starts crashing every time you start it, before it even finishes drawing the GUI, try deleting/renaming %appdata%/ESRI/ArcCatalog/ArcCatalog.gx. Ahhh, there’s a better fix: just rename/move the last opened directory, or add a new data file to it. (Bug logged, incident #75755.)

Code to calc UTMX/Y for a point shapefile loaded in ArcMap.

Procedure: set data frame coordinate system to desired UTM zone > Select only those points in the Zone (requires point-on-poly overlay with utm_zones poly) > Open Attributes > Select UTM_X column (which is Longitude) > Calc Values > Advanced > paste code block from below > Set Output to equal X or Y depending on which column you are doing. Lather, Rinse, Repeat until done. (courtesy of http://forums.esri.com/Thread.asp?c=93&f=982&t=54791#135972)

 

Dim pMxDoc As IMxDocument
Set pMxDoc = ThisDocument
Dim pMap As IMap
Set pMap = pMxDoc.FocusMap
Dim pGeometry As IGeometry
Set pGeometry = [Shape]
' project the feature geometry into the data frame's coordinate system (the UTM zone)
pGeometry.Project pMap.SpatialReference
Dim pPoint As IPoint
Set pPoint = pGeometry
X = pPoint.X
Y = pPoint.Y

Code to grab Yukon names from the CGNS web feature server:

http://www.cubewerx.com/cwpost/cwpost.cgi?serverUrl=http://cgns-dev.nrcan.gc.ca/cgi-bin/cubeserv.cgi?service=wfs%26datastore=cgns&postBody=Paste%20your%20transaction%20here

What the heck am I doing trying to convert broken CSV to shape, when they have a server which can spit the same thing out already baked into a spatial format? This will chop Excel/OpenOffice Calc out of the loop and then we won’t have to fix the broken NTS names (those fine programs like to change 105e15 into 1.05e+15).

<?xml version="1.0" encoding="ISO-8859-1" ?>
<GetFeature srsName="EPSG:4269">
  <Query typeName="GEONAMES">
    <Filter>
      <PropertyIsEqualTo>
        <PropertyName>REGION_CODE</PropertyName>
        <Literal>60</Literal>
      </PropertyIsEqualTo>
    </Filter>
  </Query>
</GetFeature>
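For the record, posting that request straight to the WFS endpoint (the cubeserv.cgi URL buried in the cwpost link above) can also be done from Python; a rough, untested sketch:

import urllib.request

ENDPOINT = "http://cgns-dev.nrcan.gc.ca/cgi-bin/cubeserv.cgi?service=wfs&datastore=cgns"

REQUEST_XML = """<?xml version="1.0" encoding="ISO-8859-1" ?>
<GetFeature srsName="EPSG:4269">
  <Query typeName="GEONAMES">
    <Filter>
      <PropertyIsEqualTo>
        <PropertyName>REGION_CODE</PropertyName>
        <Literal>60</Literal>
      </PropertyIsEqualTo>
    </Filter>
  </Query>
</GetFeature>
"""

req = urllib.request.Request(
    ENDPOINT,
    data=REQUEST_XML.encode("iso-8859-1"),
    headers={"Content-Type": "text/xml"},
)
# Save whatever the server returns (GML, or a ServiceExceptionReport on error).
with urllib.request.urlopen(req) as resp, open("yk_names.gml", "wb") as out:
    out.write(resp.read())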

The user guide says one can stick output="SHAPE" in the GetFeature line, but I get an error with that:

<?xml version="1.0" encoding="ISO-8859-1"?>
<ServiceExceptionReport version="1.1.3" xmlns=" http://www.opengis.net/ows "
xmlns:xsi=" http://www.w3.org/2001/XMLSchema-instance "
xsi:schemaLocation=" http://www.opengis.net/ows http://schemas.cubewerx.com/schemas/wms/1.1.3/ServiceExceptionReport.xsd ">
<ServiceException>
CubeSERV-00002: Syntax error detected in XML stream "(stdin)" on line 2 char pos
               46 (raised in function CwXmlScanText_ReadString() of file
               "cw_xmlscan.c" line 2243)
</ServiceException>
<ServiceException>
CubeSERV-00002: Hit unexpected character #x94 while scanning XML token (raised
               in function CwXmlScanText_ReadString() of file "cw_xmlscan.c"
               line 2200)
</ServiceException>
</ServiceExceptionReport>

Oh, so that’s why not to use WFS: it’s broken (or I’m not using it properly). Okay, I’m spending too much time on that. Go back to kludge-ville and regex search & replace the NTS names:
Open yk_names_25jul2006.dbf in Excel, copy the NTS column to Vim (we really should be doing this in Python to make it easily repeatable), then:

# match 115A08 and delete the trailing two digits
:%s/\(\d\d\d\a\)\d\d/\1/g
# strip MCR130
:%s/mcr130,//g
# fix false exponents (delete periods and trailing +0##)
:%s/\.//g
:%s/+\d\d\d//g
# change incorrect 105-zero-zero-zero... to 105O
:%s/00\+/O/g

Next problem is to merge dupes (105a,105a,105a,105b —> 105a,105b). Hmmm. I think I’ve gone beyond what’s easy in vim, and now there’s no choice but to learn the python way.
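Here’s about as far as the Python needs to go for the duplicate-merging part; a small sketch (the field name and sample value are just illustrations):

def merge_dupes(nts_field):
    # Collapse "105a,105a,105a,105b" down to "105a,105b", keeping order.
    seen = []
    for code in nts_field.split(","):
        code = code.strip()
        if code and code not in seen:
            seen.append(code)
    return ",".join(seen)

print(merge_dupes("105a,105a,105a,105b"))   # -> 105a,105b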

Going back and looking at some of the intermediate WFS request outputs, I see that there is inconsistency in the attributes. This needs a more studied look, but the one of immediate relevance is that there is a Relevance At Scale (r_value) field which in the API CSV file is mostly blank, while in the GML output that field is fully populated. That’s enough to tell me it is foolish to rely on the CSV as an authoritative source, so I’m backing up and going to start from the GML.

Try #2 at downloading from the Geonames WFS server

1. Download with wget, using the example command line from section 4.3 of the GNSS User Guide. It failed before because of line-length limitations in CMD. The workaround is to save the command into a text file and run with:

wget -O output_file.gml -i http_command.txt

2. Use ogr2ogr to convert the GML to shape, but shape has an attribute name length limit. To get the proper attribute names in Arc we need to dance around a little: use ogrinfo output_file.gml to generate the attribute schema (output_file.gfs), edit the .gfs and strip the leading “GEONAMES.” from each <Name>, then convert to shape using ogr2ogr, which will generate an empty shapefile with the correct headings. Edit the .dbf file with Excel and copy the column headings. Undo the edits to the .gfs (or delete it altogether), and convert again to shape. Open the second .dbf in Excel, paste the proper field headings, save and exit.

ogrinfo yk_names.gml
vim yk_names.gfs # :%s/GEONAMES\.//g; save
ogr2ogr -a_srs EPSG:4269 yk_names yk_names.gml
# excel yk_names/geonames.dbf; copy 1st row; close dbf
del yk_names.gfs
ogr2ogr -a_srs EPSG:4269 yk_names/ yk_names.gml
# excel yk_names/geonames.dbf; paste 1st row; close dbf
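Since the whole point was a script that can be re-run once a year, here’s a rough sketch of automating the .gfs edit and the conversion with Python and subprocess (assumes the GDAL/OGR command-line tools are on the PATH; it skips the Excel header shuffle):

import subprocess

GML = "yk_names.gml"
GFS = "yk_names.gfs"
OUT = "yk_names"

# 1. Reading the GML with ogrinfo gets the GML driver to write the .gfs schema file.
subprocess.run(["ogrinfo", GML], check=True, stdout=subprocess.DEVNULL)

# 2. Strip the leading "GEONAMES." from each <Name> in the schema.
with open(GFS) as f:
    schema = f.read()
with open(GFS, "w") as f:
    f.write(schema.replace("GEONAMES.", ""))

# 3. Convert to a shapefile; the field names now come from the edited schema.
subprocess.run(["ogr2ogr", "-a_srs", "EPSG:4269", OUT, GML], check=True)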

and that’s all for now folks!

Matt.Wilkie@gov.yk.ca

Geographic Information,
Information Management and Technology,
Yukon Department of Environment
10 Burns Road * Whitehorse, Yukon * Y1A 4Y9
867-667-8133 Tel * 867-393-7003 Fax
http://environmentyukon.gov.yk.ca/geomatics/

3 thoughts on “Yukon Place Names”

  1. Hi Matt,
    Have you been able to access the WFS or WMS for Canadian Geographic Names recently?
    When I try to open it in Arc or Qgis it just comes up empty and there are error messages that suggest that the server is down or files have been moved.
    *this is my first time trying to use this type of service, so it could be an error on my side possibly.

    Thanks!
