Aug 12, 2024
Revelio Wage and Salary Base Data
The Revelio Labs wage and salary estimate model has been applied to the RAW dataset to provide a baseline and backfill of wage and salary estimates up to March 23, 2023. Use the backfilled_date_time files to populate RAW with baseline wage and salary data, then append the daily wage and salary files. Both sets of files are joinable to RAW by hash.
FTP file path: /PartnerProducts/SalaryData/backfilled_date_time/
S3/GCP/Azure file path: .../SalaryData/salary_information_acl/backfilled_date_time/
Description: Backfill of wage and salary estimates by hash up to March 23, 2023, split across 32 parquet files. Joinable to RAW job records by hash.
File Name Structure: data_#_#_#
Format: .parquet
NOTE: It is recommended to use the latest version of each salary mapping, as identified by the date_time field. Salary data can also be mapped point-in-time if desired.
| Field Name | Data Type | Description | 
| hash | STRING/VARCHAR | The unique identifier for job records. This is used to join wage and salary data to job records. | 
| mean_salary | FLOAT | The average salary. | 
| lower_bound | FLOAT | The lower bound of the salary within a confidence range. | 
| upper_bound | FLOAT | The upper bound of the salary within a confidence range. | 
| date_time | VARCHAR | The date and time this hash was mapped to this salary, formatted as {date}_{hour}_{minute} (GMT time zone). This value changes whenever the salary data points are updated, so it tracks each update to a salary mapping and can be used to find the latest salary value for a hash. | 
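As a rough illustration of the recommendation to use the latest mapping, the sketch below keeps one row per hash based on date_time. It assumes pandas is available and that date_time values look like "2023-03-23_14_05" (YYYY-MM-DD_HH_MM); the exact layout is not confirmed by this page, and the function name and sample rows are hypothetical.

```python
import pandas as pd

# Assumption: date_time looks like "2023-03-23_14_05" (YYYY-MM-DD_HH_MM, GMT).
DATE_TIME_FORMAT = "%Y-%m-%d_%H_%M"


def latest_salary_per_hash(salaries: pd.DataFrame) -> pd.DataFrame:
    """Return one row per hash: the mapping with the most recent date_time."""
    mapped_at = pd.to_datetime(
        salaries["date_time"], format=DATE_TIME_FORMAT, utc=True
    )
    return (
        salaries.assign(mapped_at=mapped_at)
        .sort_values("mapped_at", kind="stable")
        .drop_duplicates(subset="hash", keep="last")
        .drop(columns="mapped_at")
        .reset_index(drop=True)
    )


# Made-up rows for illustration; in practice the frame would come from
# pd.read_parquet(...) on the backfilled_date_time files.
sample = pd.DataFrame(
    {
        "hash": ["a1", "a1", "b2"],
        "mean_salary": [50000.0, 55000.0, 72000.0],
        "lower_bound": [45000.0, 50000.0, 65000.0],
        "upper_bound": [56000.0, 61000.0, 80000.0],
        "date_time": ["2023-03-20_08_30", "2023-03-22_09_00", "2023-03-21_10_15"],
    }
)
latest = latest_salary_per_hash(sample)
print(latest)
```

Parsing date_time into a timestamp before sorting avoids relying on string ordering, which would only be safe if every component is zero-padded.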
Revelio Wage and Salary Daily Files
FTP file path: /PartnerProducts/RevelioLabs/salary_information_acl/YYYY-MM-DD_##_##/
S3/GCP/Azure file path: /salary_information_acl/YYYY-MM-DD_##_##/
Description: Daily wage and salary files, joinable to RAW job records by hash.
Folder Structure: YYYY-MM-DD_##_##/
File Name Structure: data_#_#_#
Format: .parquet
NOTE: It is recommended to use the latest version of each salary mapping, as identified by the date_time field. Salary data can also be mapped point-in-time if desired.
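One possible reading of point-in-time mapping is "the salary each hash had as of a chosen cutoff". The sketch below follows that reading: it drops mappings made after the cutoff and then keeps the most recent remaining row per hash. The date_time layout (YYYY-MM-DD_HH_MM), the function name, and the sample rows are assumptions for illustration only.

```python
import pandas as pd

# Assumption: date_time looks like "2023-03-23_14_05" (YYYY-MM-DD_HH_MM, GMT).
DATE_TIME_FORMAT = "%Y-%m-%d_%H_%M"


def salaries_as_of(salaries: pd.DataFrame, as_of: str) -> pd.DataFrame:
    """Return, per hash, the latest mapping made at or before `as_of` (GMT)."""
    cutoff = pd.Timestamp(as_of, tz="UTC")
    parsed = salaries.assign(
        mapped_at=pd.to_datetime(
            salaries["date_time"], format=DATE_TIME_FORMAT, utc=True
        )
    )
    # Ignore any mapping that did not exist yet at the cutoff.
    visible = parsed[parsed["mapped_at"] <= cutoff]
    return (
        visible.sort_values("mapped_at", kind="stable")
        .drop_duplicates(subset="hash", keep="last")
        .drop(columns="mapped_at")
        .reset_index(drop=True)
    )


# Made-up rows for illustration only.
history = pd.DataFrame(
    {
        "hash": ["a1", "a1", "b2"],
        "mean_salary": [50000.0, 55000.0, 72000.0],
        "lower_bound": [45000.0, 50000.0, 65000.0],
        "upper_bound": [56000.0, 61000.0, 80000.0],
        "date_time": ["2023-03-20_08_30", "2023-03-25_09_00", "2023-03-24_10_15"],
    }
)
snapshot = salaries_as_of(history, "2023-03-23 00:00")
print(snapshot)
```

In this sample, only the earlier a1 mapping predates the cutoff, so the snapshot contains a single row and b2 is absent.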
| Field Name | Data Type | Description | 
| hash | STRING/VARCHAR | The unique identifier for job records. This is used to join wage and salary data to job records. | 
| mean_salary | FLOAT | The average salary. | 
| lower_bound | FLOAT | The lower bound of the salary within a confidence range. | 
| upper_bound | FLOAT | The upper bound of the salary within a confidence range. | 
| date_time | VARCHAR | The date and time this hash was mapped to this salary, formatted as {date}_{hour}_{minute} (GMT time zone). Used to track each time a salary mapping is updated and can be used to find the latest salary value for a hash. | 
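The overall workflow (start from the backfill, append daily files, join to RAW by hash) might look something like the sketch below. It assumes pandas, a date_time layout of YYYY-MM-DD_HH_MM, and a RAW frame that carries a hash column; the title column, function name, and all sample values are hypothetical.

```python
import pandas as pd

# Assumption: date_time looks like "2023-03-23_14_05" (YYYY-MM-DD_HH_MM, GMT).
DATE_TIME_FORMAT = "%Y-%m-%d_%H_%M"


def attach_latest_salaries(
    raw_jobs: pd.DataFrame,
    backfill: pd.DataFrame,
    daily_frames: list[pd.DataFrame],
) -> pd.DataFrame:
    """Append daily salary rows to the backfill, keep the latest row per hash,
    and left-join the result onto RAW job records."""
    all_salaries = pd.concat([backfill, *daily_frames], ignore_index=True)
    mapped_at = pd.to_datetime(
        all_salaries["date_time"], format=DATE_TIME_FORMAT, utc=True
    )
    latest = (
        all_salaries.assign(mapped_at=mapped_at)
        .sort_values("mapped_at", kind="stable")
        .drop_duplicates(subset="hash", keep="last")
        .drop(columns="mapped_at")
    )
    # Left join keeps job records that have no salary estimate yet.
    return raw_jobs.merge(latest, on="hash", how="left", validate="many_to_one")


# Made-up frames; real ones would come from pd.read_parquet(...) on each file.
raw_jobs = pd.DataFrame(
    {"hash": ["a1", "b2", "c3"], "title": ["Analyst", "Engineer", "Clerk"]}
)
backfill = pd.DataFrame(
    {
        "hash": ["a1", "b2"],
        "mean_salary": [50000.0, 72000.0],
        "lower_bound": [45000.0, 65000.0],
        "upper_bound": [56000.0, 80000.0],
        "date_time": ["2023-03-20_08_30", "2023-03-21_10_15"],
    }
)
daily = pd.DataFrame(
    {
        "hash": ["a1"],
        "mean_salary": [58000.0],
        "lower_bound": [52000.0],
        "upper_bound": [64000.0],
        "date_time": ["2023-03-24_06_00"],
    }
)
enriched = attach_latest_salaries(raw_jobs, backfill, [daily])
print(enriched)
```

The left join is a deliberate choice here so that job records without a salary mapping stay in the output with empty salary fields rather than being dropped.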
Related: Revelio Labs Wage and Salary Data Methodology Guide
