Connect Your Procore Data by Downloading Analytics Models (Beta)
Objective
To connect your Procore data to a BI system by downloading Analytics models programmatically.
Things to Consider
- Requirements:
- Python 3.6 or higher installed on your system.
- 'config.share' file received from Procore.
- The necessary Python packages installed on your system.
- The script supports both PySpark and Python.
- If you are using PySpark, make sure Spark 3.5.1 or later and Java are installed, and that the SPARK_HOME environment variable is configured.
- If you are using Python and the target location is an MSSQL DB, install ODBC Driver 17 for SQL Server on your system.
Steps
- Download Credentials File
- Run the user_exp.py Script
- Run as PySpark
- Run as Python
- Choose Your Own Method
Download Credentials File
- Create a file called 'config.share'.
- Add the fields below:
{
  "shareCredentialsVersion": 1,
  "bearerToken": "",
  "endpoint": "",
  "expirationTime": ""
}
- Add the shareCredentialsVersion, bearerToken, endpoint and expirationTime values received from Procore to the config.share file.
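Before running any of the scripts, it can help to confirm that config.share is valid JSON and contains all four fields. The following optional sketch is not part of the Procore-provided scripts; it assumes expirationTime is an ISO 8601 timestamp, as in standard Delta Sharing profile files.

# Optional sanity check for config.share (hypothetical helper, not a Procore script).
import json
from datetime import datetime, timezone

REQUIRED_KEYS = {"shareCredentialsVersion", "bearerToken", "endpoint", "expirationTime"}

with open("config.share") as f:
    creds = json.load(f)

missing = REQUIRED_KEYS - creds.keys()
if missing:
    raise SystemExit(f"config.share is missing fields: {sorted(missing)}")

# Warn if the bearer token has already expired (assumes an ISO 8601 expirationTime).
try:
    expires = datetime.fromisoformat(creds["expirationTime"].replace("Z", "+00:00"))
    if expires < datetime.now(timezone.utc):
        print("Warning: bearer token has expired; request a new config.share from Procore.")
except ValueError:
    print("Could not parse expirationTime:", creds["expirationTime"])
print("config.share contains all required fields.")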
Run the user_exp.py Script
Use the following templates to create a config.yaml file with the necessary configurations (a sketch of how the generated file can be loaded back follows the templates).
- For Azure Storage:
  cron_job: # true/false
  run_as: # pyspark/python
  source_config:
    config_path: # path to the config.share file
    tables:
      - '' # table name if you want to download a specific table; leave empty to download all tables
  source_type: delta_share
  target_config:
    auth_type: service_principal
    client_id: # client_id
    secret_id: # secret_id
    storage_account: # storage account name
    storage_path: # <container>@<storage-account>.dfs.core.windows.net/<directory>
    tenant_id: # tenant_id
  target_type: azure_storage
- For MSSQL DB:
  cron_job: # true/false
  run_as: # pyspark/python
  source_config:
    config_path: # path to the config.share file
    tables:
      - '' # table name if you want to download a specific table; leave empty to download all tables
  source_type: delta_share
  target_config:
    database: # target database
    host: # target hostname:port
    password: # password
    schema: # target schema (defaults to procore_analytics)
    username: # username
  target_type: sql_server
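Either way, the generated config.yaml can be read back with any standard YAML parser. The sketch below is not part of user_exp.py itself; it simply shows how the file might be loaded and inspected, and requires the PyYAML package.

# Load the generated config.yaml and inspect its settings.
# Requires: pip install pyyaml
import yaml

with open("config.yaml") as f:
    cfg = yaml.safe_load(f)

profile_path = cfg["source_config"]["config_path"]         # path to the config.share file
tables = [t for t in cfg["source_config"]["tables"] if t]  # empty entries mean "all tables"

print("Delta Share profile:", profile_path)
print("Tables requested:", tables or "all")
print("Target type:", cfg["target_type"])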
Run as PySpark
If your environment is already set up with Spark, choose the 'pyspark' option when prompted. Alternatively, once the 'config.yaml' file has been generated, you can run the following commands to download the reports to the data directory (a minimal sketch of the underlying read follows the commands).
- For Writing to ADLS Gen2 Storage:
spark-submit --packages io.delta:delta-sharing-spark_2.12:3.1.0,org.apache.hadoop:hadoop-azure:3.4.0,com.microsoft.azure:azure-storage:8.6.6,org.apache.hadoop:hadoop-common:3.4.0 --exclude-packages com.sun.xml.bind:jaxb-impl delta_share_to_sql_spark.py
- For Writing to MSSQL DB:
spark-submit --packages io.delta:delta-sharing-spark_2.12:3.1.0 --jars <Location of mssql-jdbc jar> delta_share_to_sql_spark.py
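At its core, the submitted script reads each shared table through the Delta Sharing Spark connector supplied by the --packages flag above. A minimal sketch of that read, where the share, schema, and table names are placeholders:

# Minimal PySpark sketch of the core read; run via spark-submit as shown above
# so the Delta Sharing connector is on the classpath.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("procore-analytics-download").getOrCreate()

# Table URL format: <path to config.share>#<share>.<schema>.<table> (placeholders below).
table_url = "config.share#my_share.my_schema.my_table"
df = spark.read.format("deltaSharing").load(table_url)

# Write the table out locally; the actual script targets ADLS Gen2 or an MSSQL DB.
df.write.mode("overwrite").parquet("data/my_table")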
Run as Python
- From the command line, navigate to the folder by entering the “cd <path to the folder>” command.
- Install required packages using “pip install -r requirements.txt” or “python -m pip install -r requirements.txt”.
- Open SQL Server Integration Services (SSIS) and create a new project.
- From the SSIS Toolbox drag and drop Execute Process Task.
- Double click Execute Process Task.
- Go to the Process tab.
- Next to Executable, enter the path to python.exe in the Python installation folder.
- In WorkingDirectory, enter the path to the folder containing the script you want to execute (without the script file name).
- In Arguments, enter the name of the script you want to execute, including the .py extension (for example, delta_share_to_azure_panda.py), and click Save.
- Click Start in the top ribbon menu.
- While the task runs, the output of the Python console is displayed in an external console window.
- Once the task completes, it displays a checkmark.
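For context, when run_as is set to python and the target is an MSSQL DB, the download ultimately writes tables through the ODBC Driver 17 for SQL Server mentioned in Things to Consider. The following is a rough, hypothetical sketch of that kind of write using pandas and SQLAlchemy; the host, credentials, and table below are placeholders, not Procore's actual implementation.

# Hypothetical sketch: write a DataFrame to SQL Server via ODBC Driver 17.
# Requires: pip install pandas sqlalchemy pyodbc
from urllib.parse import quote_plus

import pandas as pd
from sqlalchemy import create_engine

# Placeholder values mirroring the target_config keys in config.yaml.
host, port, database = "myserver.example.com", 1433, "analytics"
username, password = "procore_user", "secret"
schema = "procore_analytics"

engine = create_engine(
    f"mssql+pyodbc://{username}:{quote_plus(password)}@{host}:{port}/{database}"
    "?driver=ODBC+Driver+17+for+SQL+Server"
)

# Stand-in for a table downloaded from the Delta Share.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
df.to_sql("example_table", engine, schema=schema, if_exists="replace", index=False)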
Choose Your Own Method
Delta Sharing is an open protocol for secure data sharing. You can find the public GitHub repository for Delta Sharing at https://github.com/delta-io/delta-sharing. The repository includes examples and documentation for accessing shared data from a variety of languages and tools, including the Python connector and the Apache Spark connector (SQL, Python, Scala, Java, R).
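As a quick illustration of the Python route, the open-source connector can list the tables you have access to and load them directly into pandas. A minimal sketch, where the share, schema, and table names are placeholders:

# Minimal sketch using the open-source Python connector.
# Requires: pip install delta-sharing
import delta_sharing

profile = "config.share"
client = delta_sharing.SharingClient(profile)

# Enumerate every table exposed by the share.
for table in client.list_all_tables():
    print(table.share, table.schema, table.name)

# Load one table as a pandas DataFrame: "<profile>#<share>.<schema>.<table>".
df = delta_sharing.load_as_pandas(f"{profile}#my_share.my_schema.my_table")
print(df.head())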