Components
The four constituent parts of a Protocol, and by extension of any Deidentifier.
Tags, Filters, Pixel and Private together define the complete handling of any incoming DICOM Dataset.
Filters
Checks each dataset and either accepts it for further processing or rejects it. Common reasons for rejection are unknown DICOM with burnt-in information, non-conformant DICOM, or an unknown SOPClass.
Filters can be applied at multiple points in a deidentification process. In particular, Filters can reject a dataset outright at the start, but can also be called after Pixel has run, because Pixel can change the tag ‘PatientIdentityRemoved’, which is a potential input to Filters.
The Filters component is solely responsible for rejecting datasets. No other component can do this.
Syntax
A filter is defined in the form of a boolean or propositional truth function. For example:
<Modality == "MR"> and <Manufacturer contains "Company A"> -> Reject
Relationships between propositions are expressed purely with the standard logical connectives and, or, not, and parentheses ( ) for grouping.
Each proposition in the formula is a boolean function over a tag value. The test performed inside a proposition can be of any form, as long as the outcome is boolean (yes/no).
The outcome of the formula is always Reject yes/no.
For a deidentifier, Filters will be implemented to be actually runnable. For a protocol, Filters can be written down in any formal language that implements boolean logic.
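As an illustration only, the example filter above could be made runnable along the following lines. This is a minimal sketch, not a prescribed implementation: a dataset is modelled as a plain mapping from tag keyword to string value, and the helper names (equals, contains, all_of, any_of, negate, should_reject) are hypothetical.

```python
from typing import Callable, Iterable, Mapping

# Assumption: a dataset is modelled as a mapping of tag keyword -> string value.
Dataset = Mapping[str, str]
Proposition = Callable[[Dataset], bool]


def equals(tag: str, value: str) -> Proposition:
    """Proposition: the tag is present and its value equals `value`."""
    return lambda ds: ds.get(tag) == value


def contains(tag: str, text: str) -> Proposition:
    """Proposition: the tag value contains `text` (a missing tag counts as empty)."""
    return lambda ds: text in ds.get(tag, "")


def all_of(*props: Proposition) -> Proposition:
    """Logical 'and' over propositions."""
    return lambda ds: all(p(ds) for p in props)


def any_of(*props: Proposition) -> Proposition:
    """Logical 'or' over propositions."""
    return lambda ds: any(p(ds) for p in props)


def negate(prop: Proposition) -> Proposition:
    """Logical 'not' of a proposition."""
    return lambda ds: not prop(ds)


# <Modality == "MR"> and <Manufacturer contains "Company A"> -> Reject
reject_company_a_mr = all_of(
    equals("Modality", "MR"),
    contains("Manufacturer", "Company A"),
)


def should_reject(ds: Dataset, filters: Iterable[Proposition]) -> bool:
    """A dataset is rejected as soon as any filter formula evaluates to true."""
    return any(f(ds) for f in filters)
```

Each formula stays a pure boolean function of the dataset, so the same list of filters can be evaluated at the start of processing and again after Pixel has run.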
Pixel
Processes all image data elements so that PHI is removed from the images. This includes burnt-in text, implant serial numbers and faces.
The tag PatientIdentityRemoved can be set by Pixel and is not touched by Tags processing.
Syntax
The protocol Pixel processing definition differs for the two types of pixel-based PHI: Burnt-in image PHI and Dynamic image PHI.
For burnt-in PHI
For Burnt-in image PHI, pixel processing is defined like a boolean function using only the tags from the Image Type ID subspace, followed by one or more rectangular pixel regions to black out. For example:
<Modality == "MR"> and <Manufacturer contains "Company A"> ->
[0,0,512,30], [0,400,512,30]
The format for a black-out region is [top, left, size-x, size-y] where top
and left are the pixel coordinates of the top left of the region, counting from
the top left of the image (top left of the image = (0,0)), and size-x and size-y
are the size of the box in pixels.
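Applying such regions could look roughly like the sketch below. It is illustrative only and rests on stated assumptions: the image is a single-frame, single-channel list of pixel rows; the first two values of a region are read, in the order given above, as the row (top) and column (left) of the region's top-left corner; regions that extend past the image are clipped. The function name black_out is hypothetical.

```python
from typing import List, Sequence

# Assumption: a region is [top, left, size-x, size-y], with top = row and
# left = column of the top-left corner, counted from (0, 0) at the image's top left.
Region = Sequence[int]


def black_out(pixels: List[List[int]], regions: Sequence[Region]) -> List[List[int]]:
    """Return a copy of `pixels` with every listed region set to 0 (black)."""
    height = len(pixels)
    width = len(pixels[0]) if pixels else 0
    result = [row[:] for row in pixels]  # leave the input image untouched
    for top, left, size_x, size_y in regions:
        # Clip each region to the image bounds before blanking it.
        for y in range(max(top, 0), min(top + size_y, height)):
            for x in range(max(left, 0), min(left + size_x, width)):
                result[y][x] = 0
    return result


# Usage: blank a full-width strip covering the top two rows of a 6 x 8 image.
image = [[255] * 8 for _ in range(6)]
masked = black_out(image, [[0, 0, 8, 2]])
```

If a deidentifier interprets the first two values in the opposite order, only the unpacking line in the loop needs to change.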
Note
In the future, pixel data processing will probably move to OCR-type techniques where text is recognized in any image regardless of its ‘type’. This will make the currently described approach unnecessary. A list of type -> black-out region mappings can then still be useful for testing purposes.
For dynamic image PHI
For Dynamic image PHI, there is no set method or syntax. A protocol should document whether any dynamic image PHI should be removed. This should be a human-readable description. There is no set format for this.
For a deidentifier the description should include a description of the methods used, if any. The evidence should make it
Private
Private tag handling boils down to maintaining a list of ‘safe private’ tags. The DICOM standard allows indicating whether a deidentification method retains safe private tags (option ‘Rtn. Safe Priv. Opt’ in table E.1-1). The standard does not define which private tags are considered safe. Different organizations maintain their own lists.
Syntax
If a protocol retains safe private tags, these are defined as a list of private tags deemed safe. For example:
0013,["Company_A"]01
0013,["Company_A"]02
0075,["Company_B"]01
0075,["Company_B"]0e
0075,["Company_B"]31
Looking at the first example 0013,["Company_A"]01 in detail:
0013 is the element group number
Company_A is the value of the private creator tag
01 is the last part of the element number (the first part is set dynamically by the private creator tag)
See Private tag for more information on private tag structure.
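For a deidentifier, entries in this notation could be parsed into their three parts as sketched below. This is a hedged example rather than a reference parser: the regular expression simply mirrors the notation shown above, and the names SafePrivateTag, parse_safe_private and is_safe are hypothetical.

```python
import re
from typing import Iterable, NamedTuple


class SafePrivateTag(NamedTuple):
    group: int    # element group number, e.g. 0x0013
    creator: str  # value of the private creator tag, e.g. "Company_A"
    element: int  # last part of the element number, e.g. 0x01


# Mirrors the notation used above: GGGG,["Creator"]EE with hexadecimal digits.
_ENTRY = re.compile(r'^([0-9A-Fa-f]{4}),\["([^"]+)"\]([0-9A-Fa-f]{2})$')


def parse_safe_private(entry: str) -> SafePrivateTag:
    """Split one safe-private entry into group, creator and element part."""
    match = _ENTRY.match(entry.strip())
    if match is None:
        raise ValueError(f"not a safe private tag entry: {entry!r}")
    group, creator, element = match.groups()
    return SafePrivateTag(int(group, 16), creator, int(element, 16))


def is_safe(tag: SafePrivateTag, safe_list: Iterable[SafePrivateTag]) -> bool:
    """A private tag is retained only if it appears in the safe list."""
    return tag in set(safe_list)


safe_list = [parse_safe_private(e) for e in (
    '0013,["Company_A"]01',
    '0075,["Company_B"]0e',
)]
```

Matching on the creator value rather than on a full element number reflects that the first part of the element number is assigned dynamically per dataset.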