A shared file system for lambda functions

(aws.amazon.com)

👤petercooper🕑5y🔼173🗨️109

(Replying to PARENT post)

I'm curious what people actually use Lambda for?

I tried Lambda for a use-case that I had in 2018:

We published Polls and Predictions to people watching the 2018 World Cup. We set the vote callback URL to a function on AWS Lambda.

It failed spectacularly during our load-testing because the ramp-up period was far too slow. We needed to go from 0 to 100,000 incoming requests/second in about 20 seconds.

We had to switch to an Nginx/Lua/Redis endpoint because Lambda was just completely unusable. It would have cost us $27,000/month to pre-provision 10,000 concurrent executions...

What is it that people actually use Lambda for?

👤VWWHFSfQ🕑5y🔼0🗨️0

(Replying to PARENT post)

I've found EFS enticing in theory but painfully slow and riddled with issues in practice. In the past I've tried it thinking "it's basically an EBS volume I can mount on > 1 EC2 instance," only to find terrible read performance and misc. low-level NFS errors.

👤kleebeesh🕑5y🔼0🗨️0

(Replying to PARENT post)

Serverless is a bit like Stone Soup [1]. This I guess is the point at which the Tramp says: "Now if you just add a few onions it really helps the flavour..."

[1] https://en.m.wikipedia.org/wiki/Stone_Soup

👤scandox🕑5y🔼0🗨️0

(Replying to PARENT post)

I could see this being very useful.

I recall Joyent's solution to this (similar) problem where you have an object stored somewhere (e.g. S3) and you want to use that object in a container, but you have to copy it over HTTP or something to do any work on it and the object could be very large.

With Joyent's Manta[1] you would spin up a container right where an object is stored (instead of bringing the objects to the container via NFS). It also has MapReduce support.

[1] https://apidocs.joyent.com/manta/jobs-reference.html

👤actionowl🕑5y🔼0🗨️0

(Replying to PARENT post)

This is a horrible idea. This gives lambda functions shared mutable state to interfere with each other, with very brittle semantics compared to most databases (even terrible ones).

👤jupp0r🕑5y🔼0🗨️0

(Replying to PARENT post)

I worked with EFS, but not Lambda, in 2017-2018 when migrating an app to AWS: an app which included a bunch of random application code that assumed it could read or write into a network file system. Having EFS as a migration target to replace on-prem CIFS was relatively pleasant, and it removed the need to rewrite a bunch of the application code. S3 would have been a reasonable replacement, but that would have required weeks or months of rewrites to hunt down filesystem calls and rework them to use a simpler object store API.
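To make the rewrite cost concrete, here is a minimal sketch of the same "append to a report" operation written twice: once as ordinary file calls against a mounted directory, and once against a whole-object get/put interface. `InMemoryObjectStore` and both `save_report_*` functions are hypothetical names invented for illustration; the in-memory class only stands in for an object store and is not any real SDK.

```python
import os
from typing import Dict


def save_report_fs(root: str, name: str, data: bytes) -> None:
    """Existing-style code: plain file calls against a mounted directory."""
    path = os.path.join(root, name)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "ab") as f:  # append in place, as legacy code often does
        f.write(data)


class InMemoryObjectStore:
    """Hypothetical stand-in for an object store: whole-object get/put by key."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects.get(key, b"")


def save_report_obj(store: InMemoryObjectStore, name: str, data: bytes) -> None:
    """Same operation with whole-object semantics: read, modify, rewrite."""
    store.put(name, store.get(name) + data)
```

With a network file system as the target, the first function keeps working once `root` points at the mount; the second shape is roughly what every such call site would have to become.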

One thing that tripped us up at the time was EFS not supporting encryption in transit, but this was fixed in early 2018 when EFS began supporting the use of stunnel to wrap the underlying NFS connection in TLS. https://docs.aws.amazon.com/efs/latest/ug/encryption-in-tran...

It reads as if this Lambda-integrated EFS works out of the box with encryption in transit.

👤shoo🕑5y🔼0🗨️0

(Replying to PARENT post)

Does anybody have a price comparison between this and storing stuff in an S3 bucket and loading it every time?

👤matteuan🕑5y🔼0🗨️0

(Replying to PARENT post)

OMG THIS IS SO AMAZING I HAVE BEEN WANTING THIS FOR AN ENTIRE YEAR NOW. (I've been using Lambda to do massively distributed compile jobs, but had reached the throughput limits I could achieve with distcc-like techniques doing local preprocessing, and so was looking at doing limited synchronization of my codebase to S3 to then either link against the compiler or, for other tools and to let me use the gold standard compiler I want, do C runtime injection to make it so that when files are opened I pull them from S3... but that entire process sucked and doesn't really solve the general purpose problem: this does; this lets me trivially do the moral equivalent of make -j1000 and have all of the random sub-jobs get executed in lambda functions and have the compile complete nearly instantaneously. I can even have those jobs just directly share state and do "exactly what you'd expect" with respect to the inter-dependency stuff <- which like, is a tradeoff, but one that fits well with how most projects are already designed when using make... I'm so pumped to go back and work on that project again.)

👤saurik🕑5y🔼0🗨️0

(Replying to PARENT post)

Disclosure: I work on Google Cloud.

It wasn't obvious to me if this is somehow mounting EFS over !NFS (since the post never says the word NFS). When people ask "Should I use Lambda / Google Cloud Functions / Cloud Run against my NFS server?", my response isn't "How would you set that up?" but "Be careful. Cleaning up NFS locks held by clients that have gone away is fairly painful, and you have none of the mechanisms to make sure it exits properly".

Alternatively, you can mount without locking, and then you get one of the comments downthread about "and now you've given functions shared mutable state but with bad primitives".

tl;dr: Cool! ... But, how does this handle NFS locking?
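For readers who have not used file locks on a shared mount, this is a minimal sketch of what a function would do to serialize writes to a shared file, using POSIX advisory locks from Python's standard library. The helper names `exclusive_lock` and `append_line` are made up for illustration, and whether the Lambda mount honors these locks, or releases them when an execution environment disappears, is exactly the open question above; the sketch assumes it does.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive_lock(path):
    """Hold an exclusive advisory (POSIX) lock on `path` for the duration of the block."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Blocks until the lock is granted. On an NFS mount with locking
        # enabled, the server tracks this lock, which is why a client that
        # vanishes while holding it is a problem.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        try:
            yield fd
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def append_line(path, line):
    """Append one line to a shared file while holding the lock."""
    with exclusive_lock(path) as fd:
        os.lseek(fd, 0, os.SEEK_END)
        os.write(fd, (line + "\n").encode("utf-8"))
```

The unlock only runs if the process gets to the `finally` block. A function that is frozen or killed mid-write never reaches it, so cleanup falls to whatever lease or timeout the server enforces.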

👤boulos🕑5y🔼0🗨️0

(Replying to PARENT post)

Pretty sweet.

Is anybody using Lambda to run huge MapReduce jobs? Do people still use Hadoop?

Doesn't this basically just let you have something like HDFS for running large distributed computations with some shared state, without having to reach for S3 or redis?

👤ralusek🕑5y🔼0🗨️0

(Replying to PARENT post)

This is a sharp knife to hand people: because EFS is just NFS, it uses NFS for security / isolation. Everything that can mount a given volume needs to agree on which Unix users are which, and you need to make sure to completely lock down root access, otherwise you can't enforce any kind of data isolation.
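As a rough illustration of what "NFS for isolation" comes down to, the sketch below inspects the numeric owner and mode bits of a path, which are the only things separating one tenant's files from another's under plain POSIX permissions. `describe_access` is a hypothetical helper, and the sketch assumes the volume relies on ordinary Unix ownership as described above.

```python
import os
import stat


def describe_access(path):
    """Report the numeric owner of `path` and whether group/other have any access."""
    st = os.stat(path)
    return {
        "uid": st.st_uid,  # numeric ID; every client must map it to the same user
        "gid": st.st_gid,
        "open_to_group_or_other": bool(
            st.st_mode & (stat.S_IRWXG | stat.S_IRWXO)
        ),
    }
```

Any client that can act as a different numeric uid, or as root, sidesteps these checks, which is why locking down root matters so much here.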

If your use case can deal with one EFS volume per isolation boundary, you can use IAM to control who can mount what volume, which might be easier to reason about.

Cloud-y DLP tools don't know about EFS.

👤philsnow🕑5y🔼0🗨️0

(Replying to PARENT post)

Very cool functionality, but so much complexity to set up, connect, and manage all the various services.

👤anderspitman🕑5y🔼0🗨️0

(Replying to PARENT post)

Ooh. I wonder if EFS is compatible with SQLite's NFS mode.

More seriously, this is huge. Unix pipes over shared NFS have always been my big data platform of choice (since before the cloud, or even Google MapReduce). Things have finally come full circle.

👤hedora🕑5y🔼0🗨️0

(Replying to PARENT post)

Cool, so we're starting to curve more sharply around the full circle we'll eventually go on.

So now lambda functions can mount persistent block storage.

Next up: allow your lambda functions to run for longer

Then: allow multiple lambda functions to execute concurrently, and indefinitely, as a group, while having block storage mounted

And finally: use your EC2 instances as lambda functions

👤Thorentis🕑5y🔼0🗨️0

(Replying to PARENT post)

What is the advantage of using this over S3? Is it just the speed & latency difference between S3 and EFS?

👤abrookewood🕑5y🔼0🗨️0

(Replying to PARENT post)

This is misdirection by naming, like how University of Phoenix is similar to Arizona State University, Phoenix. "Lambda functions" sounds like anonymous functions, but is actually referring to a proprietary interface to AWS Lambda™. They named it this way so that readers confuse AWS Lambda™ for programming lambdas.

👤Konohamaru🕑5y🔼0🗨️0

(Replying to PARENT post)

Curious if it adds significantly to cold start times.

👤tyingq🕑5y🔼0🗨️0

(Replying to PARENT post)

This is going to be a really interesting way to benchmark EBS performance.

👤whalesalad🕑5y🔼0🗨️0

(Replying to PARENT post)

Can we please change the title? I was really afraid that AWS might have lost it and called the new file system "new", which must be the least practical name for anything from an SEO standpoint.

👤black_puppydog🕑5y🔼0🗨️0

(Replying to PARENT post)

"lambda functions" not "lambda expressions" - It is funny (or not) how Amazon redefines how a term is used by naming a product. The words actually make no sense, but people do not care and say "lambda function" anyway.

👤zelphirkalt🕑5y🔼0🗨️0