Join Jeopardy

Joining a table to features is an incredibly useful tool. It saves on duplication and double (or triple!) data entry and lets us focus on getting data right in only one place and then rest easy knowing everything else is also correct automatically.

However it’s also easy to go wrong if you’re in a rush or otherwise not in a mind space to pay attention to detail. I’ve been doing this stuff for approaching two decades and still this morning I almost sent something broken out for public distribution. I had been so focussed on getting accurate numbers in the labels I forgot to verify the polygons were coloured correctly. In my mind that part was completed finished last week! (My only task this morning was to fix a single label.)




So what happened?

The selection shown (in cyan) is a from Select By Attribute query:

SELECT * from Game_Management_Subzones_tick_data_hide_summary WHERE tick_data.Ticks_Y_N = 'Y'

Table tick_data is Joined to feature class Game_Management_Subzones on GMA_ID, which they both have in common. The join is one-to-many; there are multiple tick_data records for each GMA.

The Select By Attribute query cycles through all the matching joined tick_data records, HOWEVER layer Symbol by Unique Value stops on the first matching record found.

Paraphrasing About joining and relating tables:

…[in a report] you see multiple records, one for each corresponding match. The multiple matches are also seen when:

  • using a join field while symbolizing a joined layer,
  • labeling,
  • identifying features,
  • generating a graph,
  • and using either the Find or Hyperlink tool.
  • [Also when] using the joined layer as input to a geoprocessing tool or in an export operation, the multiple matching records are used.

Caution: In all cases of 1:M joins, only the first matching record is joined and displayed in the layer’s attribute table

[And when symbolizing as we see here, and yes this is contrary to the official documentation; maybe the limitation is specific to By Unique Value]


 So what’s the fix?

In this case it’s to Summarize before joining. (Other data might need different pre-processing.) From tick_data on GMA_ID and save maximum value from ‘Total_ticks’ column, join, and symbolize on Max Ticks (‘0′ is “No ticks found” and All other values’ is “Ticks found”).



Thanks to Meghan Larivee in the Animal Health Unit (Environment Yukon) for the data and opportunity to learn anew how to (almost) not shoot myself in the foot. –Matt.

Leave a Reply

Your email address will not be published. Required fields are marked *