Resolving a Django “Memory Leak” Problem on Long-Running Intensive Processes

Data conversions tend to be straightforward but detail-oriented processes that stress systems in unexpected places. While importing several hundred thousand records into a Django development environment, a job that also involved reshaping a number of important data sets, memory use climbed steadily until the system slowed to a crawl.

The root cause is that the environment runs in debug mode, and in debug mode each Django database connection keeps a log of every query it executes. The import issues millions of database statements and runs for nearly half an hour, so that log grows relentlessly. I prefer not to continually fiddle with the configuration file in my development environment, so I did the following instead.

For any given data dump, the importer prints a status report every 60 seconds: the number of records completed, the total number of records in the dump, and the percentage complete. The SQL statement log is not interesting and is merely consuming memory, so at the same point it's flushed thusly:

from django import db
db.reset_queries()  # discard the SQL log each connection keeps while DEBUG is on

This has resolved the creeping memory problem. If any other unusual circumstances required releasing resources, this periodic checkpoint would be an opportune place to do so.


