Adding a Revision Number for File Uploads Coldfusion

Earlier this week, Pablo Fredrikson from our Platform team was paged because one of the Kubernetes pods that runs one of our Lucee CFML containers was running out of disk space. Upon further investigation, he found that the server's temporary file directory was using over 160 Gigabytes of storage. To perform an immediate remediation, my team triggered a deployment for that ColdFusion service, which wiped all of the old data. But, once the "incident" was closed, I started trying to figure out why so much data was being stored. And, what I discovered is that the temporary files produced during a multipart/form-data POST are duplicated and persisted if the parent ColdFusion request uses CFThread to manage asynchronous processing.

Usually, when you upload a file to a ColdFusion server during a multipart/form-data POST, ColdFusion automatically deletes the temporary .upload file after the request completes. However, it turns out that if the parent ColdFusion request uses the CFThread tag, the server stops deleting the temporary file associated with the POST. And, in fact, the server creates a duplicate of the temporary file for every CFThread tag that is spawned.

This is not the first time that I've seen CFThread cause some unexpected behavior. As I discussed a few months ago, Lucee CFML appears to incur some request-cloning overhead when spawning threads - a side-effect that I noticed when debugging performance issues with FusionReactor. I'm assuming that this temporary upload file duplication is related to that same behavior.

To isolate this behavior in a clear way, I created two ColdFusion files: one that outputs the contents of the temporary directory; and, one that accepts a file-upload. First, let's look at the temporary directory code:

    <cfscript>

    	param name="url.clear" type="boolean" default="false";

    	// If prompted, delete everything in the temp directory and reload the listing.
    	if ( url.clear ) {

    		for ( record in directoryList( getTempDirectory() ) ) {

    			fileDelete( record );

    		}

    		location( url = "./listing.cfm", addToken = false );

    	}

    	tempFiles = directoryList(
    		path = getTempDirectory(),
    		listInfo = "query",
    		filter = "*.upload"
    	);

    	tempFiles.sort( "dateLastModified", "desc" );

    </cfscript>

    <h1>
    	Lucee CFML Temp Files
    </h1>

    <cfdump var="#tempFiles#" />

    <p>
    	<a href="./listing.cfm">Refresh</a> ,
    	<a href="./listing.cfm?clear=true">Clear Temp Directory</a>
    </p>

As you can see, all this does is dump-out the contents of the getTempDirectory() directory; and, if prompted, clear the contents.

The other ColdFusion page creates a file-upload, but doesn't do anything with it; and, if prompted, will spawn several no-operation (noop) CFThreads:

    <cfscript>

    	param name="form.data" type="string" default="";
    	param name="form.doSpawnThread" type="boolean" default="false";

    	if ( form.data.len() && form.doSpawnThread ) {

    		// Spawn a number of threads so that we can see how the thread-count affects the
    		// way that temporary upload files are handled.
    		for ( i = 1 ; i <= 10 ; i++ ) {

    			thread
    				name = "noopThread-#i#"
    				index = i
    				{

    				// No-op thread....

    			}

    		}

    	}

    </cfscript>

    <!doctype html>
    <html lang="en">
    <body>

    	<h1>
    		Upload A File To Lucee CFML
    	</h1>

    	<cfif form.data.len()>
    		<p>
    			<strong>Temp File:</strong> <cfoutput>#form.data#</cfoutput>
    		</p>
    	</cfif>

    	<form method="post" enctype="multipart/form-data">
    		<p>
    			<input type="file" name="data" />
    			<input type="submit" value="Upload" />
    		</p>
    		<p>
    			<input type="checkbox" name="doSpawnThread" value="true" />
    			Spawn <code>CFThread</code>
    		</p>
    	</form>

    </body>
    </html>

As you can see, there's basically nothing going on here.

So, let's start with a base test: using the multipart/form-data POST to upload a file, but without triggering any CFThread:

Temporary upload files being handled in Lucee CFML.

As you can see, when we upload the file using the multipart/form-data POST, Lucee CFML creates a temporary .upload file. Then, once the request is over, that temporary file is cleared-out automatically and our getTempDirectory() listing is left empty.

Now, we're going to perform the same workflow; but, this time, we're going to spawn ten CFThread tags as well. Those CFThread tags don't do anything - it's their very existence that causes the change in behavior:

Temporary upload files being handled in Lucee CFML.

As you can see, when the parent request of a multipart/form-data POST spawns a CFThread tag, we get some very interesting behaviors:

  1. The temporary .upload file generated during the upload is no longer removed automatically after the parent request finishes processing.

  2. A new temporary .upload file is generated for each CFThread tag that is spawned from the parent request. In this case, we spawned 10 CFThread tags, so we end up with 11 .upload files in the getTempDirectory().

  3. The per-CFThread .upload files are all uniquely named; and, we don't have any idea what they are at runtime. I don't show it in the demo GIF, but all of the form.data references in the CFThread tag bodies show the parent's version of the .upload file.
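
The third point could be checked with something like the following. This is only a minimal, untested sketch: the thread name "inspectThread" is hypothetical, and it assumes the same form.data file field used in the upload page above. It logs the temporary path seen by the parent request and the path seen inside a CFThread body so that the two can be compared in the server output:

```cfml
<cfscript>

	param name="form.data" type="string" default="";

	if ( form.data.len() ) {

		// The temporary .upload path as seen by the parent request.
		systemOutput( "Parent sees: #form.data#", true );

		// NOTE: "inspectThread" is a hypothetical name used only for this sketch.
		thread name = "inspectThread" {

			// Based on the behavior described above, this is expected to log the
			// parent's path, not the path of the thread's duplicated .upload file.
			systemOutput( "Thread sees: #form.data#", true );

		}

	}

</cfscript>
```

If both lines log the same path, then the duplicated files are effectively invisible to application code, which is why cleaning them up by name from within the request isn't an option.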

This clearly explains why some of our Lucee CFML pods were using over 160 Gigabytes of data! Consider that some of the requests that deal with file-uploads are also spawning asynchronous CFThread tags to do the following:

  • Generating thumbnails.
  • Generating version-history of assets.
  • Logging analytics information.

... it's no wonder that duplicated bytes are adding up fast!

So, what to do about this? Well, since each duplicated .upload file is given a unique name that neither the parent request nor the CFThread tags have access to, I think the safest approach will be to periodically purge the getTempDirectory() directory of any files that are older than X-minutes, where X is some reasonable threshold. Perhaps 5-minutes?

Here's what that could look like (though, I have not tested this code) - it gathers all .upload files older than 5-minutes and then deletes them:

    <cfscript>

    	purgeOldTempFiles();

    	// ------------------------------------------------------------------------------- //
    	// ------------------------------------------------------------------------------- //

    	/**
    	* I delete ".upload" files from the getTempDirectory() that are older than the given
    	* age in minutes.
    	*
    	* @tempDirectory I am the web server's temporary directory.
    	* @ageInMinutes I am the age at which to consider a .upload file purgeable.
    	*/
    	private void function purgeOldTempFiles(
    		string tempDirectory = getTempDirectory(),
    		numeric ageInMinutes = 5
    		) {

    		var tempFiles = directoryList(
    			path = tempDirectory,
    			listInfo = "query",
    			filter = "*.upload"
    		);

    		var cutoffAt = now().add( "n", -ageInMinutes );

    		// Filter the temp-directory files down to those that were modified before the
    		// cutoff date. These are the files that are considered safe to delete.
    		var oldTempFiles = tempFiles.filter(
    			( tempFile ) => {

    				return( tempFile.dateLastModified < cutoffAt );

    			}
    		);

    		for ( var tempFile in oldTempFiles ) {

    			var tempFilePath = ( tempFile.directory & "/" & tempFile.name );

    			// Since files can be locked by a process, we don't want one "poison pill" to
    			// break this process. Each file operation should be wrapped in a try-catch
    			// so that if it fails, we can continue trying to delete other, old files.
    			try {

    				systemOutput( "Deleting temp file: #tempFilePath#", true, true );
    				fileDelete( tempFilePath );

    			} catch ( any error ) {

    				systemOutput( error, true, true );

    			}

    		}

    	}

    </cfscript>

This could be triggered as part of a scheduled-task; or, in our case, as part of the Kubernetes (K8) health-check that runs every 10-seconds in every container in our distributed system.

Again, I assume that this behavior is related to the "request cloning" that seems to take place in Lucee CFML when a CFThread tag is spawned. I assume that part of that request cloning is the cloning of any FORM data, complete with temporary files. So, hopefully my idea to purge old .upload files is a sufficient way to combat the storage growth.


Source: https://www.bennadel.com/blog/3889-temporary-upload-files-are-duplicated-and-persisted-when-a-request-uses-cfthread-in-lucee-cfml-5-3-6-61.htm
