SupervisedTagDiscovery

QueueTask Name Database Id / Name
QueueTask_SupervisedTagDiscovery 1 - SupervisedTagDiscovery (Deprecated)
QueueTask_SupervisedTagDiscoveryParent 10 - SupervisedTagDiscoveryParent
QueueTask_SupervisedTagDiscoveryChild 11 - SupervisedTagDiscoveryChild

This process takes an existing set of Tags and scans MediaFiles looking for new Tag examples that best match the source tags.

Launching a Job

Settings

These are the interface configuration options when launching a Job via the web interface.

Setting Name Description
Name Not used internally. This is a bookkeeping item for later reference by the user. For example, a researcher could name each task after the experiment number recorded in their notes, which makes it easy to find the task and review its settings later.
Discovery Classes The TagClasses to be used for discovery. To search multiple classes at once, select more than one TagClass.
Source TagSets

Which TagSets are used for the source Tags. For example:

  • We have 3 TagClasses: Singing, Speaking, Silence.
  • We have 3 TagSets, say a separate TagSet per person who tagged: Tony, David, Michael.

You can select “Singing” and “Michael” to run the search with only the Tags that Michael tagged as Singing. Alternatively, you can select “Singing” and all TagSets to use examples tagged “Singing” by anyone.

Destination TagSet This parameter designates the TagSet destination for the resulting created machine tags.
Number of Tags to Discover How many machine tags to discover per file, per source tag. Choosing “1” tells the system to select only one top match per tag, per file. The total number of new tags will be up to (Number of Tags to Discover * Number of Files * Number of Source Tags)
Spectra Weight This parameter partly defines the similarity function used for comparing spectra. Together with the Pitch Weight, Pitch Energy Weight, and Average Energy Weight parameters, it should add up to 1. Each pixel of a spectra shows how much acoustic energy is present in one frequency band during one Time Frame. The spectra is therefore a two-dimensional matrix of numbers that, without further processing, is mapped to colors for display. This parameter designates how much weight the comparison of these two-dimensional arrays should carry in the similarity function. Experiment with different combinations of these settings to see which gives the highest accuracy for the problem at hand. More weight on this parameter is useful for general matches on both pitch and rhythm. A good starting point is to set this parameter to “1” and the other three to “0”.
Average Energy Weight This parameter partly defines the similarity function used for comparing spectra. Together with the Spectra Weight, Pitch Weight, and Pitch Energy Weight parameters, it should add up to 1. It designates how much weight the change in average energy over time should carry in the similarity function. More weight on this parameter is useful for identifying spectra with similar rhythms.
Pitch Weight This parameter partly defines the similarity function used for comparing spectra. Together with the Spectra Weight, Pitch Energy Weight, and Average Energy Weight parameters, it should add up to 1. It designates how much weight the pitch trace should carry in the similarity function. The pitch trace is how the frequency band with maximum energy changes over time. More weight on this parameter is useful for identifying differences between higher and lower notes in the spectra, or the melody of a person’s speech in a lower or higher voice.
Pitch Energy Weight This parameter partly defines the similarity function used for comparing spectra. Together with the Spectra Weight, Pitch Weight, and Average Energy Weight parameters, it should add up to 1. For each Time Frame, the system finds the band with the highest energy. Pitch energy measures how that maximum energy value changes over time. This parameter designates how much weight that measurement should carry in the similarity function. More weight on this parameter is useful for identifying changes in audio volume.
Min Match Performance This specifies the minimum Tag Strength allowed to save new Tags. If the best candidates are below this threshold, they will not be saved. Therefore, it is possible that fewer than Number of Tags to Discover will actually be saved. 0 practically disables this (all Tags saved) and 1 would indicate a perfect match. Practical values tend to be around 0.5 - 0.75 but depend upon the specific project.
Spectra Details
Number of Frequency Bands This parameter designates the number of divisions of a spectra between minFrequency and maxFrequency. Each band can be thought of as a tuning fork or a hair in the human inner ear. Reasonable values range between 1 and 3500, because 3500 is the approximate number of hairs in a human inner ear and represents what a human being could possibly hear. The more bands you use, the higher the matching resolution, but the more computation is required, resulting in a slower response.
Number Time Frames Per Second This parameter designates how often to sample the energy of each hair or tuning fork. The sum of potential energy and kinetic energy for each band is always a constant. For example, if you are trying to identify a quickly changing audio event such as syllables, a reasonable parameter would be approximately 100 times per second or more. If you are trying to identify a sustained audio event such as applause or background noise, a reasonable parameter could be 1 sample per second.
Damping Ratio This parameter represents how quickly the band responds to changes in the audio event. This is equivalent to designating how much “drag” is on the tuning fork or hair, or how quickly the tuning fork will stop ringing. It ranges between 0 and 1. A high damping ratio (0.9) is best for examining quick rhythmic features. A low damping ratio (0.001) is better for detecting a faint, steady sound, such as a mechanical hum or a fan in the background that does not change pitch.
Min. Frequency Of the Frequency Bands you have chosen, this parameter represents the lowest value of vibrations per second (Hz) on the range of bands. In other words, this is the lowest audio frequency to which a tuning fork or hair would respond. E.g., the lowest key on a piano is 27.5 Hz. See http://en.wikipedia.org/wiki/Piano_key_frequencies.
Max. Frequency Of the Frequency Bands you have chosen, this parameter represents the highest value of vibrations per second on the range of bands. In other words, this is the highest audio frequency to which a tuning fork or hair would respond. E.g., the highest key on a piano is 4,186 Hz. See http://en.wikipedia.org/wiki/Piano_key_frequencies.
Advanced (Optional)
Search Within TagClasses Restricts the search to times within all Tags of this TagClass.
Save Best ‘N’ Tags Per File (0 or absent parameter disables this feature) Save only the best ‘N’ Tags per file, regardless of TagClass.
Save Best ‘N’ Tags Per Class Per File (0 or absent parameter disables this feature) Save only the best ‘N’ Tags per TagClass per file.
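
The weight settings and the tag-count upper bound described above can be illustrated with a short sketch. It assumes the four weights combine per-component similarity scores as a simple linear sum; the actual similarity function is not specified here and may differ. The names `combined_similarity` and `max_new_tags`, the dictionary keys, and the example scores are hypothetical.

```python
import math


def combined_similarity(scores, weights):
    """Weighted sum of per-component similarity scores.

    Assumption: components combine linearly; the real function may differ.
    Both dicts use the keys: spectra, pitch, pitch_energy, average_energy.
    """
    total = sum(weights.values())
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"weights must add up to 1, got {total}")
    return sum(weights[name] * scores[name] for name in weights)


def max_new_tags(tags_to_discover, num_files, num_source_tags):
    """Upper bound on the number of machine tags one Job can create."""
    return tags_to_discover * num_files * num_source_tags


# Suggested starting point: all weight on the spectra comparison.
weights = {"spectra": 1.0, "pitch": 0.0, "pitch_energy": 0.0, "average_energy": 0.0}
# Made-up component scores for one candidate match.
scores = {"spectra": 0.8, "pitch": 0.2, "pitch_energy": 0.5, "average_energy": 0.1}

score = combined_similarity(scores, weights)

# A candidate is only saved if it reaches Min Match Performance.
min_match_performance = 0.5
would_be_saved = score >= min_match_performance
```

With Number of Tags to Discover set to 1, 200 files, and 15 source tags, `max_new_tags(1, 200, 15)` gives an upper bound of 3000 new tags; fewer may be saved if candidates fall below Min Match Performance.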

Job Types

The QueueTask_SupervisedTagDiscovery JobType has been deprecated in favor of the newer task types.

The goal here is not to schedule a single Job to scan an entire library, but rather to define smaller Tasks, each covering a subset of the MediaFiles within the Project. Each Task can then be allocated to a node, and multiple nodes can participate, each working on its own portion.

There will be a single ‘parent’ Task created for each Supervised Tag Discovery invocation. This will merely be a placeholder, as we will then create multiple ‘child’ Tasks, one for each set of MediaFiles. QueueRunner will grab these child tasks one at a time.

Two Tasks will be used:

  • QueueTask_SupervisedTagDiscoveryParent - The ‘parent’ task
  • QueueTask_SupervisedTagDiscoveryChild - The ‘child’ tasks
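
The parent/child split can be sketched as follows. This is only an illustration: how the system actually groups MediaFiles into child Tasks is not documented here, so the fixed chunk size `files_per_child` and the function name `split_into_child_tasks` are assumptions. The comma-separated output matches the format of the child `mediaFileIds` parameter.

```python
def split_into_child_tasks(media_file_ids, files_per_child):
    """Yield one comma-separated ID string per (hypothetical) child Task."""
    if files_per_child < 1:
        raise ValueError("files_per_child must be at least 1")
    for start in range(0, len(media_file_ids), files_per_child):
        chunk = media_file_ids[start:start + files_per_child]
        yield ",".join(str(file_id) for file_id in chunk)


# Five MediaFiles split into child Tasks of at most two files each.
children = list(split_into_child_tasks([101, 102, 103, 104, 105], 2))
# children == ["101,102", "103,104", "105"]
```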

Job Parameters

Note

Prior to August 2013, there was another set of parameters not included in the JobParameters stored in the database. The TagClasses to search were set up with three additional parameters:

  • TagClassesToSearch - Selected in Start SupervisedTagDiscovery Form
  • tagClass.focusFactor - User set in Start SupervisedTagDiscovery Form
  • tagClass.matchQuality - set in Catalog with Adjust

The MatchQuality and FocusFactor for the TagClasses to search were stored in the individual TagClass rather than in the JobParameters. As a result, there is no historical record and, more importantly, the values may change after the Job is set up.

In the new format, a JobParameter will be entered for each TagClass to search. Any TagClasses not listed will be ignored.

  • searchTagClass - (tagClassId,focusFactor,minimumMatchQuality) (e.g., “1234,1.0,0.50”)
    • No spaces should be present (may or may not be allowed by the backend).
    • tagClassId - (Integer) Database Id of the TagClass to search
    • focusFactor - (Float) Typically 1.0
    • minimumMatchQuality - (Float [-1,1]) Minimum Match Quality to accept for Tags to save to the database.
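
A minimal sketch of parsing one searchTagClass value is shown below. Because it is unclear whether the backend tolerates spaces, this sketch takes the conservative route and rejects them. The names `SearchTagClass` and `parse_search_tag_class` are hypothetical and not part of the system.

```python
from typing import NamedTuple


class SearchTagClass(NamedTuple):
    tag_class_id: int
    focus_factor: float
    minimum_match_quality: float


def parse_search_tag_class(value):
    """Parse "tagClassId,focusFactor,minimumMatchQuality" into a tuple."""
    parts = value.split(",")
    if len(parts) != 3:
        raise ValueError(f"expected 3 comma-separated fields, got {len(parts)}")
    # Reject whitespace explicitly; int() and float() would silently accept it.
    if any(part != part.strip() for part in parts):
        raise ValueError("spaces are not allowed in searchTagClass")
    tag_class_id = int(parts[0])
    focus_factor = float(parts[1])
    minimum_match_quality = float(parts[2])
    if not -1.0 <= minimum_match_quality <= 1.0:
        raise ValueError("minimumMatchQuality must be within [-1, 1]")
    return SearchTagClass(tag_class_id, focus_factor, minimum_match_quality)


parsed = parse_search_tag_class("1234,1.0,0.50")
# parsed == SearchTagClass(tag_class_id=1234, focus_factor=1.0, minimum_match_quality=0.5)
```

To search several TagClasses, one such value would be stored per JobParameter record and each parsed independently.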

There will be two types of parameters: those that are common to all child tasks, which are stored with the parent task, and those that are specific to each child task.

Parent Task Parameters

Parameter Name Type / Range Description
numberOfTagsToDiscover Int How many Tags to search for per TagClass, per MediaFile. The number of returned Tags would thus be at most (numberOfTagsToDiscover * Number of Files * Number of TagClasses).
spectraWeight Float When computing correlations, how much to weight (multiply) the spectra computation.
pitchWeight Float When computing correlations, how much to weight (multiply) the pitch computation.
averageEnergyWeight Float (aka, ‘volumeWeight’) When computing correlations, how much to weight (multiply) the average energy computation.
numExemplars Int  
maxOverlapFraction Float [0,1] How much new Tags are allowed to overlap with existing Tags (in Time), as a fraction of the length of the tag (i.e., 0 = no overlap, 1 = complete overlap).
sourceTagSets List of Int Comma-separated list of database IDs
destinationTagSet Int database ID of the TagSet
searchTagClass (Int, Float,Float)

(tagClassId,focusFactor,minimumMatchQuality) Specifies which TagClasses to search, and their individual FocusFactor and MinimumMatchQuality settings (e.g., “1234,1.0,0.50”). No spaces should be present (may or may not be allowed by the backend). Insert multiple JobParameter records to search multiple TagClasses.

  • tagClassId
    (Integer) Database Id of the TagClass to search
  • focusFactor
    (Float) Typically 1.0
  • minimumMatchQuality
    (Float [-1,1]) - Minimum Match Quality to accept for Tags to save to the database.
Spectra Details
numFrequencyBands Int  
numTimeFramesPerSecond Float  
dampingRatio Float  
minFrequency Float  
maxFrequency Float  
Advanced / Optional
searchWithinTagClasses List of Int Comma-separated list of database IDs of the TagClasses in which to restrict the search.
saveBestNTags Int (0 or absent disables this feature) Save only the best ‘N’ Tags per file, regardless of TagClass.
saveBestNTagsPerClass Int (0 or absent disables this feature) Save only the best ‘N’ Tags per TagClass per file.
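
The maxOverlapFraction check can be sketched as below. The table says the overlap is measured "as a fraction of the length of the tag" without saying which tag; this sketch assumes the new (candidate) Tag's length. Tags are represented as hypothetical (start, end) pairs in seconds, and the function names are illustrative only.

```python
def overlap_fraction(new_start, new_end, existing_start, existing_end):
    """Fraction of the new Tag's duration that overlaps an existing Tag."""
    length = new_end - new_start
    if length <= 0:
        raise ValueError("new Tag must have positive duration")
    overlap = min(new_end, existing_end) - max(new_start, existing_start)
    return max(0.0, overlap) / length


def is_allowed(new_tag, existing_tags, max_overlap_fraction):
    """True if the new Tag stays within the limit against every existing Tag."""
    return all(
        overlap_fraction(*new_tag, *existing) <= max_overlap_fraction
        for existing in existing_tags
    )


existing = [(15.0, 30.0)]
candidate = (10.0, 20.0)  # overlaps the existing Tag for 5 of its 10 seconds

half = overlap_fraction(*candidate, *existing[0])       # 0.5
rejected = not is_allowed(candidate, existing, 0.25)    # 0.5 > 0.25
accepted = is_allowed(candidate, existing, 0.5)         # 0.5 <= 0.5
```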

Child Task Parameters

Parameter Name Type / Range Description
mediaFileIds List of Int Comma-separated list of database IDs of mediaFiles to search.

Priority

QueueTask_SupervisedTagDiscoveryChild uses the Job Priority field to re-order the execution of tasks. By default, in the initial implementation, the priority is set to the number of files within the Parent task. As a result, the child tasks of smaller Jobs run ahead of those of larger Jobs already in the queue.
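
The ordering described above can be sketched with a priority queue. Two assumptions are made: a lower priority value runs first (implied by smaller sets preempting others), and tasks with equal priority run in the order they were queued. The task names and the `run_order` function are hypothetical, and this is not QueueRunner's actual implementation.

```python
import heapq


def run_order(child_tasks):
    """Return task names in execution order.

    child_tasks: list of (parent_file_count, task_name) in queue order.
    """
    # The sequence number breaks ties so equal priorities stay first-in, first-out.
    heap = [
        (priority, sequence, name)
        for sequence, (priority, name) in enumerate(child_tasks)
    ]
    heapq.heapify(heap)
    order = []
    while heap:
        _, _, name = heapq.heappop(heap)
        order.append(name)
    return order


queued = [
    (500, "large-job-child-1"),
    (500, "large-job-child-2"),
    (12, "small-job-child-1"),
]
order = run_order(queued)
# order == ["small-job-child-1", "large-job-child-1", "large-job-child-2"]
```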

Advanced Modes

Search Within TagClasses

This feature restricts the search space to the time within an existing set of Tags.

For example, say we want to search a library of music only during vocals. We can first tag all of the vocal sections as “Vocals”. When we run the Supervised search, we can choose the “Vocals” TagClass for the restriction, and the search will only run within these existing Tags.

Note

The Tags used to restrict the search are similarly limited by the Source TagSets, so Tags within other TagSets will not be used within the search space.

Note

Be sure that the TagClass you are restricting within (“Vocals” above) is NOT in the Discovery Classes; otherwise, overlap prevention will block every new Tag from being created.

Note

This restricts the search only on time; it ignores any Min/Max Frequency information from the Tags.
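
The time-only restriction can be sketched as follows. Restricting Tags are reduced to (start, end) pairs in seconds, ignoring any frequency information, and overlapping Tags are merged into search windows. Whether a candidate must lie entirely inside a window or only overlap it is not stated; this sketch assumes it must lie entirely inside. The function names are hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping (start, end) time intervals into sorted windows."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous window: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def within_search_space(candidate, windows):
    """True if the candidate interval lies fully inside one search window."""
    start, end = candidate
    return any(w_start <= start and end <= w_end for w_start, w_end in windows)


# Times of existing "Vocals" Tags in one MediaFile (made-up values).
vocals = [(30.0, 60.0), (50.0, 90.0), (120.0, 150.0)]
windows = merge_intervals(vocals)
# windows == [(30.0, 90.0), (120.0, 150.0)]

inside = within_search_space((70.0, 80.0), windows)    # True
outside = within_search_space((95.0, 100.0), windows)  # False
```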