Adding a Revision Number for File Uploads Coldfusion
Earlier this week, Pablo Fredrikson from our Platform team was paged because one of the Kubernetes pods that runs one of our Lucee CFML containers was running out of disk space. Upon further investigation, he found that the server's temporary file directory was using over 160 Gigabytes of storage. To perform an immediate remediation, my team triggered a deployment for that ColdFusion service, which wiped all of the old data. But, once the "incident" was closed, I started trying to figure out why so much data was being stored. And, what I discovered is that the temporary files produced during a multipart/form-data POST are duplicated and persisted if the parent ColdFusion request uses CFThread to manage asynchronous processing.
Usually, when you upload a file to a ColdFusion server during a multipart/form-data POST, ColdFusion automatically deletes the temporary .upload file after the request completes. However, it turns out that if the parent ColdFusion request uses the CFThread tag, the server stops deleting the temporary file associated with the POST. And, in fact, the server creates a duplicate of the temporary file for every CFThread tag that is spawned.
This is not the first time that I've seen CFThread cause some unexpected behavior. As I discussed a few months ago, Lucee CFML appears to incur some request-cloning overhead when spawning threads - a side-effect that I noticed when debugging performance issues with FusionReactor. I'm assuming that this temporary upload file duplication is related to that same behavior.
To isolate this behavior in a clear way, I created two ColdFusion files: one that outputs the contents of the temporary directory; and, one that accepts a file-upload. First, let's look at the temporary directory code:
<cfscript>

	param name="url.clear" type="boolean" default="false";

	// If prompted, delete everything in the temp directory and then reload the page.
	if ( url.clear ) {

		for ( record in directoryList( getTempDirectory() ) ) {

			fileDelete( record );

		}

		location( url = "./listing.cfm", addToken = false );

	}

	// Gather the temporary upload files, most recently modified first.
	tempFiles = directoryList(
		path = getTempDirectory(),
		listInfo = "query",
		filter = "*.upload"
	);
	tempFiles.sort( "dateLastModified", "desc" );

</cfscript>

<h1>
	Lucee CFML Temp Files
</h1>

<cfdump var="#tempFiles#" />

<p>
	<a href="./listing.cfm">Refresh</a> ,
	<a href="./listing.cfm?clear=true">Clear Temp Directory</a>
</p>
As you can see, all this does is dump-out the contents of the getTempDirectory() directory; and, if prompted, clear the contents.
The other ColdFusion page accepts a file-upload, but doesn't do anything with it; and, if prompted, will spawn several no-operation (noop) CFThreads:
<cfscript>

	param name="form.data" type="string" default="";
	param name="form.doSpawnThread" type="boolean" default="false";

	if ( form.data.len() && form.doSpawnThread ) {

		// Spawn a number of threads so that we can see how the thread-count affects the
		// way that temporary upload files are handled.
		for ( i = 1 ; i <= 10 ; i++ ) {

			thread
				name = "noopThread-#i#"
				index = i
				{

				// No-op thread....

			}

		}

	}

</cfscript>

<!doctype html>
<html lang="en">
<body>

	<h1>
		Upload A File To Lucee CFML
	</h1>

	<cfif form.data.len()>
		<p>
			<strong>Temp File:</strong>
			<cfoutput>#form.data#</cfoutput>
		</p>
	</cfif>

	<form method="post" enctype="multipart/form-data">
		<p>
			<input type="file" name="data" />
			<input type="submit" value="Upload" />
		</p>
		<p>
			<input type="checkbox" name="doSpawnThread" value="true" />
			Spawn <code>CFThread</code>
		</p>
	</form>

</body>
</html>
As you can see, there's basically nothing going on here.
So, let's start with a base test: using the multipart/form-data POST to upload a file, but without triggering any CFThread:
As you can see, when we upload the file using the multipart/form-data POST, Lucee CFML creates a temporary .upload file. Then, once the request is over, that temporary file is cleared-out automatically and our getTempDirectory() listing is left empty.
Now, we're going to perform the same workflow; but, this time, we're going to spawn 10 CFThread tags as well. Those CFThread tags don't do anything - it's their very existence that causes the change in behavior:
As you can see, when the parent request of a multipart/form-data POST spawns a CFThread tag, we get some very interesting behaviors:
- The temporary .upload file generated during the upload is no longer removed automatically after the parent request finishes processing.
- A new temporary .upload file is generated for each CFThread tag that is spawned from the parent request. In this case, we spawned 10 CFThread tags, so we end up with 11 .upload files in the getTempDirectory().
- The per-CFThread .upload files are all uniquely named; and, we don't have any idea what they are at runtime. I don't show it in the demo GIF, but all of the form.data references in the CFThread tag bodies show the parent's version of the .upload file.
This clearly explains why some of our Lucee CFML pods were using over 160 Gigabytes of data! Consider that some of the requests that deal with file-uploads are also spawning asynchronous CFThread tags to do the following:
- Generating thumbnails.
- Generating version-history of assets.
- Logging analytics information.
... it's no wonder that duplicated bytes are adding up fast!
So, what to do about this? Well, since each duplicated .upload file is given a unique name that neither the parent request nor the CFThread tags have access to, I think the safest approach will be to periodically purge the getTempDirectory() directory of any files that are older than X-minutes, where X is some reasonable threshold. Perhaps 5-minutes?
Here's what that could look like (though, I have not tested this code) - it gathers all .upload
files older than 5-minutes and then deletes them:
<cfscript>

	purgeOldTempFiles();

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //

	/**
	* I delete ".upload" files from the getTempDirectory() that are older than the given
	* age in minutes.
	*
	* @tempDirectory I am the server's temporary directory.
	* @ageInMinutes I am the age at which to consider a .upload file purgeable.
	*/
	private void function purgeOldTempFiles(
		string tempDirectory = getTempDirectory(),
		numeric ageInMinutes = 5
		) {

		var tempFiles = directoryList(
			path = tempDirectory,
			listInfo = "query",
			filter = "*.upload"
		);

		var cutoffAt = now().add( "n", -ageInMinutes );

		// Filter the temp-directory files down to those that were modified before the
		// cutoff date. These are the files that are considered safe to delete.
		var oldTempFiles = tempFiles.filter(
			( tempFile ) => {

				return( tempFile.dateLastModified < cutoffAt );

			}
		);

		for ( var tempFile in oldTempFiles ) {

			var tempFilePath = ( tempFile.directory & "/" & tempFile.name );

			// Since files can be locked by a process, we don't want one "poison pill" to
			// break this process. Each file operation should be wrapped in a try-catch
			// so that if it fails, we can continue trying to delete other, old files.
			try {

				systemOutput( "Deleting temp file: #tempFilePath#", true, true );
				fileDelete( tempFilePath );

			} catch ( any error ) {

				systemOutput( error, true, true );

			}

		}

	}

</cfscript>
This could be triggered as part of a scheduled-task; or, in our case, as part of the Kubernetes (K8) health-check that runs every 10-seconds in every container in our distributed system.
Again, I assume that this behavior is related to the "request cloning" that seems to take place in Lucee CFML when a CFThread tag is spawned. I assume that part of that request cloning is the cloning of any FORM data, complete with temporary files. So, hopefully my idea to purge old .upload files is a sufficient way to combat the storage growth.
Source: https://www.bennadel.com/blog/3889-temporary-upload-files-are-duplicated-and-persisted-when-a-request-uses-cfthread-in-lucee-cfml-5-3-6-61.htm