Objects

This page describes the objects involved in deidentifying a DICOM dataset, at different levels of abstraction. Deidentification comes down to changing the right parts of a DICOM dataset. The changes to the DICOM elements of a single dataset can be characterized as a Deltaset. A Deltaset is effected by a deidentifier. A deidentifier is a concrete implementation of a Protocol.

@startuml
!includeurl  https://raw.githubusercontent.com/plantuml-stdlib/C4-PlantUML/master/C4_Context.puml

'# remove <<system>> above each block
HIDE_STEREOTYPE()

System(protocol, "protocol", "Defines handling of any incoming DICOM dataset") #121111

System(deid1, "deidentifier", "Implements a protocol") #A8A8A8
System(deid2, "deidentifier", "Implements a protocol") #4B4B4A
System(deid3, "deidentifier", "Implements a protocol") #A8A8A8

deid1 .up[#A8A8A8]. protocol
deid2 .up. protocol
deid3 .up[#A8A8A8]. protocol

deid1 -[hidden]> deid2
deid3 -[hidden]l-> deid2

System(delta1, "DeltaSet", "Change/Delta for each DICOM tag") #CECECE
System(delta2, "DeltaSet", "Change/Delta for each DICOM tag") #7D7D7D
System(delta3, "DeltaSet", "Change/Delta for each DICOM tag") #CECECE

delta1 .up[#CECECE]. deid2
delta2 .up. deid2
delta3 .up[#CECECE]. deid2

delta1 -[hidden]> delta2
delta3 -[hidden]l-> delta2

System(ds1, "Dataset", "DICOM dataset") #E6E6E7
System(ds2, "Dataset", "DICOM dataset") #A7A6A6
System(ds3, "Dataset", "DICOM dataset") #E6E6E7

ds1 .up[#CECECE]. delta2
ds2 .up. delta2
ds3 .up[#CECECE]. delta2

ds1 -[hidden]> ds2
ds3 -[hidden]l-> ds2

@enduml

Deidentification objects from abstract (top) to concrete (bottom)

Protocol

Defines how to handle the deidentification of any incoming dataset using four Components: Filters, Tags, Pixel and Private. It says nothing about implementation; it only prescribes what should be done to each part of a dataset and under which circumstances the dataset should be rejected outright.

A single protocol can be implemented by many deidentifiers.
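
To illustrate that a protocol is purely declarative, the sketch below models one as plain data in Python. The class name `DeidProtocol`, the field types, and all example values are assumptions made for this sketch; the text above only names the four Components and does not specify their contents.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class DeidProtocol:
    """Hypothetical model of a protocol: states what to do, never how."""

    # Conditions under which a dataset is rejected outright (free text here).
    filters: List[str] = field(default_factory=list)
    # Tag name -> abstract action code, e.g. "REMOVE" or "CLEAN".
    tags: Dict[str, str] = field(default_factory=dict)
    # Prescriptions for pixel data and private tags (shape is an assumption).
    pixel: List[str] = field(default_factory=list)
    private: List[str] = field(default_factory=list)


# Example values are invented for illustration only.
example_protocol = DeidProtocol(
    filters=["reject datasets that lack a Modality element"],
    tags={"PatientName": "CLEAN", "StudyDate": "REMOVE"},
    pixel=["leave pixel data untouched"],
    private=["remove all private tags"],
)

print(example_protocol.tags["StudyDate"])
```

Because the protocol holds only prescriptions, any number of deidentifiers can read the same instance and implement it in their own way.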

Deidentifier

A piece of software that takes a DICOM Dataset and removes protected health information (PHI) from it. It does this via four Components: Filters, Tags, Pixel and Private.

A deidentifier can do one of two things with an incoming dataset:

  1. Reject the dataset through triggering one of the filters.

  2. Apply a transformation to the dataset. The transformation is defined in the Tags, Pixel and Private components. The observed changes in the tags form a Deltaset.

A deidentifier implements a deidentification protocol. Multiple deidentifiers can implement the same protocol.

Unlike a Protocol, a deidentifier is a concrete implementation: it has to actually implement the protocol’s abstract Action Codes. For an action code like REMOVE this is trivial: just remove the DICOM element. For CLEAN, however, many different operations might be said to implement ‘cleaning’. It is up to the creators of a deidentifier to defend their choice of implementation in a given context.
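
The reject-or-transform behaviour and the freedom in implementing CLEAN can be sketched as follows. This is a minimal illustration, not an actual deidentifier: a dataset is reduced to a plain dict of tag name to value, the Pixel and Private components are left out, and every name (`Deidentifier`, `Rejected`, `clean_with_placeholder`, `clean_by_emptying`, the example filter) is hypothetical.

```python
from typing import Callable, Dict, List

# Simplified stand-in for a DICOM dataset: tag name -> value.
Dataset = Dict[str, str]
Action = Callable[[Dataset, str], None]


class Rejected(Exception):
    """Raised when a filter rejects the incoming dataset."""


def remove(dataset: Dataset, tag: str) -> None:
    """REMOVE: delete the element if it is present."""
    dataset.pop(tag, None)


def clean_with_placeholder(dataset: Dataset, tag: str) -> None:
    """One possible CLEAN: overwrite the value with a fixed placeholder."""
    if tag in dataset:
        dataset[tag] = "CLEANED"


def clean_by_emptying(dataset: Dataset, tag: str) -> None:
    """Another possible CLEAN: keep the element but empty its value."""
    if tag in dataset:
        dataset[tag] = ""


class Deidentifier:
    def __init__(
        self,
        filters: List[Callable[[Dataset], bool]],
        tag_actions: Dict[str, str],
        implementations: Dict[str, Action],
    ) -> None:
        self.filters = filters                  # each returns True to reject
        self.tag_actions = tag_actions          # tag name -> action code
        self.implementations = implementations  # action code -> function

    def process(self, dataset: Dataset) -> Dataset:
        # Option 1: reject the dataset if any filter triggers.
        for index, should_reject in enumerate(self.filters):
            if should_reject(dataset):
                raise Rejected(f"filter {index} rejected the dataset")
        # Option 2: transform a copy so the input stays untouched.
        result = dict(dataset)
        for tag, code in self.tag_actions.items():
            self.implementations[code](result, tag)
        return result


# The same prescriptions, implemented by two different deidentifiers.
tag_actions = {"PatientName": "CLEAN", "StudyDate": "REMOVE"}
filters = [lambda ds: "Modality" not in ds]  # invented example filter
deid_a = Deidentifier(
    filters, tag_actions, {"REMOVE": remove, "CLEAN": clean_with_placeholder}
)
deid_b = Deidentifier(
    filters, tag_actions, {"REMOVE": remove, "CLEAN": clean_by_emptying}
)

source = {"PatientName": "SMITH^JOHN", "StudyDate": "20240315", "Modality": "CT"}
print(deid_a.process(source))
print(deid_b.process(source))
```

Both deidentifiers follow the same action codes, yet they produce different values for PatientName. That difference is exactly the implementation choice the creators of a deidentifier have to defend.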

Deltaset

A set of observed changes to dataset elements. See the Spaces and Codes page for a full description.

For example:

| Tag Name | Value Before | Value After | Delta |
| --- | --- | --- | --- |
| PatientName | SMITH^JOHN | Patient01 | CHANGED |
| Modality | CT | CT | UNCHANGED |
| Study Date | 20240315 | <tag not found> | REMOVED |
| Manufacturer | Company A | <empty> | EMPTIED |
| De-identification Method | <tag not found> | deidentifier B | CREATED |
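
The example above can be reproduced with a small comparison of a dataset before and after deidentification. The sketch below is an assumption-laden illustration: datasets are plain dicts of tag name to value, a missing key stands for "tag not found", an empty string stands for "empty", and only the five delta codes shown in the example are handled. The function names `classify` and `compute_deltaset` are hypothetical; the Spaces and Codes page remains the authoritative description.

```python
from enum import Enum
from typing import Dict, Optional


class Delta(Enum):
    UNCHANGED = "UNCHANGED"
    CHANGED = "CHANGED"
    REMOVED = "REMOVED"
    EMPTIED = "EMPTIED"
    CREATED = "CREATED"


def classify(before: Optional[str], after: Optional[str]) -> Delta:
    """Classify one element. None means 'tag not found', '' means 'empty'."""
    if before is None and after is None:
        raise ValueError("tag is absent both before and after")
    if before is None:
        return Delta.CREATED
    if after is None:
        return Delta.REMOVED
    if before == after:
        return Delta.UNCHANGED
    if after == "":
        return Delta.EMPTIED
    return Delta.CHANGED


def compute_deltaset(
    before: Dict[str, str], after: Dict[str, str]
) -> Dict[str, Delta]:
    """Compare every tag that occurs in either dataset."""
    tags = sorted(set(before) | set(after))
    return {tag: classify(before.get(tag), after.get(tag)) for tag in tags}


before = {
    "PatientName": "SMITH^JOHN",
    "Modality": "CT",
    "StudyDate": "20240315",
    "Manufacturer": "Company A",
}
after = {
    "PatientName": "Patient01",
    "Modality": "CT",
    "Manufacturer": "",
    "DeidentificationMethod": "deidentifier B",
}

for tag, delta in compute_deltaset(before, after).items():
    print(tag, delta.value)
```

Note that the Deltaset is derived purely from observation: it needs only the two datasets, not any knowledge of the deidentifier that produced the change.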

Dataset

A standardized container that stores a medical image along with associated metadata. Each dataset contains both pixel data (the actual medical image) and a comprehensive set of information tags that describe patient details, acquisition parameters, and clinical context. Each element in a dataset consists of a tag, a tag description and a value. For example:

| Tag | Description | Example Value |
| --- | --- | --- |
| (0010,0010) | Patient’s Name | SMITH^JOHN |
| (0010,0020) | Patient ID | MRN12345678 |
| (0010,0030) | Patient’s Birth Date | 19700101 |
| (0008,0020) | Study Date | 20240315 |
| (0008,0060) | Modality | MR |
| (0008,0070) | Manufacturer | Medical systems LTD |
| (0008,0090) | Referring Physician’s Name | JONES^SARAH^M.D. |
| (0020,000D) | Study Instance UID | 1.2.840.10008.1.2.3.4 |

DICOM datasets can be stored as files, in databases or in memory.
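
As an illustration of the in-memory case, the sketch below holds a few of the example elements in a Python dict keyed by tag. The `Element` class and `format_tag` helper are hypothetical names for this sketch and do not correspond to any particular DICOM library. Pixel data is omitted.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# A DICOM tag is a pair of 16-bit numbers: (group, element).
Tag = Tuple[int, int]


@dataclass
class Element:
    tag: Tag
    description: str
    value: str


def format_tag(tag: Tag) -> str:
    """Render a tag in the usual (gggg,eeee) hexadecimal notation."""
    group, element = tag
    return f"({group:04X},{element:04X})"


elements = [
    Element((0x0010, 0x0010), "Patient's Name", "SMITH^JOHN"),
    Element((0x0008, 0x0060), "Modality", "MR"),
    Element((0x0020, 0x000D), "Study Instance UID", "1.2.840.10008.1.2.3.4"),
]

# The in-memory dataset: look elements up by tag.
dataset: Dict[Tag, Element] = {e.tag: e for e in elements}

for tag, element in dataset.items():
    print(format_tag(tag), element.description, element.value)
```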