# Demo DVC Data Registry

**Repository Path**: charlize/demo-dvc-data-regis

## Basic Information

- **Project Name**: Demo DVC Data Registry
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-10-20
- **Last Updated**: 2025-10-20

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# DVC HTTP Client Demo

A simple Python data science project managed with `uv` and DVC, with data tracked on an HTTP remote.

## Setup Instructions

### 1. Initialize Project

```bash
# Initialize uv project
uv init --name dvc-http-client-demo --app

# Create data directory
mkdir -p data
```

### 2. Create Sample Data

Create `data/employees.csv` with sample employee data:

```csv
name,age,salary,department
Alice,28,75000,Engineering
Bob,32,85000,Marketing
Charlie,25,65000,Engineering
Diana,35,95000,Management
Eve,29,72000,Marketing
Frank,31,88000,Engineering
Grace,27,68000,HR
Henry,33,92000,Management
Ivy,26,70000,HR
Jack,30,82000,Engineering
```

### 3. Initialize DVC

```bash
# Initialize git repository
git init

# Add DVC as dev dependency
uv add --group dev dvc

# Initialize DVC repository
uv run dvc init

# Configure the default DVC remote
uv run dvc remote add -d myremote http://10.160.43.82:5000/
```

### 4. Track Data with DVC

```bash
# Add CSV file to DVC tracking
uv run dvc add data/employees.csv

# Commit DVC metadata and the remote configuration to git
git add .dvc/config data/.gitignore data/employees.csv.dvc
git commit -m "Add employee data to DVC tracking"
```

### 5. Configure Authentication

If you get `401 Unauthorized` errors when pushing:

```bash
# Method 1: Embed credentials in the URL
# (simple, but the password is stored in plain text in .dvc/config; do not commit it)
uv run dvc remote modify myremote url http://username:password@10.160.43.82:5000/

# Method 2: Configure basic auth in the DVC config
# (--local writes to .dvc/config.local, which is not tracked by git)
uv run dvc remote modify myremote auth basic
uv run dvc remote modify --local myremote user YOUR_USERNAME
uv run dvc remote modify --local myremote password YOUR_PASSWORD

# Test connection
curl -u username:password http://10.160.43.82:5000/
```

### 6. Push Data to Remote

```bash
# Push data to DVC remote
uv run dvc push
```

## Usage

### Run Data Analysis

```bash
# Add pandas dependency
uv add pandas

# Run the analysis script
uv run python analyze_data.py
```

### Check DVC Status

```bash
# Check DVC status
uv run dvc status

# Check data status
uv run dvc data status

# List files in the repository (add --dvc-only to show only DVC-tracked files)
uv run dvc list .
```

## Project Structure

```
├── data/
│   ├── employees.csv        # Data file (managed by DVC)
│   ├── employees.csv.dvc    # DVC metadata file
│   └── .gitignore           # DVC-generated gitignore
├── .dvc/                    # DVC configuration
├── .venv/                   # Virtual environment
├── analyze_data.py          # Data analysis script
└── pyproject.toml           # Project configuration
```

## Troubleshooting

### 401 Unauthorized Error

1. Verify credentials with curl:
   ```bash
   curl -u username:password http://10.160.43.82:5000/
   ```
2. Try embedding credentials in the remote URL
3. Check whether the server requires token authentication

### Data Not Synced

- Check `uv run dvc status`
- Ensure `.dvc` files are committed to git
- Run `uv run dvc push` to sync with the remote
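
The Usage section runs `analyze_data.py`, which appears in the project structure but is not shown in this README. The following is a minimal, hypothetical stand-in, assuming the script summarizes headcount and mean salary per department from `data/employees.csv`. The function name `summarize_by_department` and the choice of summary are assumptions, and the sketch uses only the standard library rather than pandas so that it runs without extra dependencies; the actual script may differ.

```python
"""Hypothetical stand-in for analyze_data.py (the real script is not shown here)."""
import csv
from collections import defaultdict
from pathlib import Path
from statistics import mean

DATA_PATH = Path("data/employees.csv")  # file created in the setup steps


def summarize_by_department(csv_path):
    """Return {department: (headcount, mean salary)}, sorted by department name."""
    salaries = defaultdict(list)
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            salaries[row["department"]].append(int(row["salary"]))
    return {dept: (len(vals), mean(vals)) for dept, vals in sorted(salaries.items())}


def main():
    if not DATA_PATH.exists():
        # The CSV is git-ignored, so a fresh clone has to fetch it from the remote.
        print(f"{DATA_PATH} not found; run `uv run dvc pull` first")
        return
    for dept, (count, avg) in summarize_by_department(DATA_PATH).items():
        print(f"{dept:<12} n={count}  mean salary={avg:,.0f}")


if __name__ == "__main__":
    main()
```

On a fresh clone the data file is absent until `uv run dvc pull` has fetched it from the remote, which is why the sketch checks for the file before reading it.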