So, recently I spun up cAdvisor to provide some metrics for the Grafana dashboard. I created both the docker-compose.yml and prometheus.yml thusly:

prometheus.yml:

spoiler
scrape_configs:
- job_name: cadvisor
  scrape_interval: 5s
  static_configs:
  - targets:
    - cadvisor:8080

docker-compose.yml

spoiler
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    ports:
    - 9090:9090
    command:
    - --config.file=/etc/prometheus/prometheus.yml
    volumes:
    - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
    depends_on:
    - cadvisor
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    ports:
    - 8080:8080
    volumes:
    - /:/rootfs:ro
    - /var/run:/var/run:rw
    - /sys:/sys:ro
    - /var/lib/docker/:/var/lib/docker:ro
    depends_on:
    - redis
  redis:
    image: redis:latest
    container_name: redis
    ports:
- 6379:6379

Placed them both in /tmp/cadvisor/ and ran docker compose up. All well and good, got some metrics to feed Grafana and all would seem jippity jippity.

Next day I notice Prometheus is off line. Hmm, check everything out. Logs complaining of a missing prometheus.yml. On a hunch I recreated the above prometheus.yml and placed it back in /tmp/cadvisor/, restart Prometheus, and it fires right up no runs, no drips, no errors. Before I uploaded the new prometheus.yml, I notice that there is a directory now named prometheus.yml in /tmp/cadvisor/, which is empty. Deleted it.

Next day, same scenario. Missing prometheus.yml, directory called prometheus.yml in /tmp/cadvisor/. I thought well, if it’s getting deleted, change the permissions, and continued my daily affairs.

Today, same exact scenario. So, wtf, over? Run some commands:

stat /tmp/cadvisor/prometheus.yml
sudo lsof /tmp/cadvisor/prometheus.yml
grep "delete" /var/log/syslog

I can see that the file IS being deleted, but I cannot seem to trace down what is deleting it. It’s like there is a cron job that fires off every day at a certain time and deletes prometheus.yml, and in it’s place, creates a directory called prometheus.yml effectively taking Prometheus offline. I have no such cron job tho.

Any ideas? Suggestions? Ancient wizardry? Any mystical incantations or tomes to consult?

  • cass80@programming.dev
    link
    fedilink
    English
    arrow-up
    2
    ·
    7 hours ago

    My guess is the cleanup process is running as root and clobbers anything it sees regardless of permissions. But that’s a guess. I’ve never tried keeping long term data in tmp.