For many years, building an automated software system meant setting up multiple servers, each with a dedicated CPU configuration, memory, storage, and other resources. A team of administrators was then formed to manage these systems, and the development team built on that infrastructure to create the processes that connect the servers.
This process can be complicated because it involves many different groups working together towards a common goal, and conflicts of interest between those groups can become a problem.
It can also be quite costly. You need administrators on your payroll, and servers that run continuously consume resources even when they are not being used.
To maintain good performance over time, you also need an auto-scaling solution that adjusts the server resources to the load.
A cloud platform has a clear advantage here: it lets you create an end-to-end architecture without setting up a server cluster. From an administration perspective, there are no servers to maintain.
This is a cost-effective option for startups and for the minimum viable product (MVP) phase of a project. It is a good starting point if future production load and user activity are difficult to predict, because that is exactly when it is hard to determine the configuration of cluster servers.
What makes serverless architecture stand out is the automation of processes through serverless cloud services. It connects those services and produces results similar to those of traditional cluster servers.
The following is an example of building such an architecture using only native AWS services.
Picking the Services for a Serverless Flow
Imagine that you want to create a platform that gathers various data and pictures of the infrastructure of some specific assets (these can be any manufacturing or utility assets). The platform needs to do the following:
- To make future analytics possible, the incoming data must first be ingested.
- After applying business rules, a back-end procedure saves the calculated outputs as normalized information in a relational database.
- An application front end displays the normalized, clean data so that users can view the results.
Let’s examine which components the architecture could include.
AWS S3 Buckets
Amazon S3 buckets are a great way to store files or pictures in the AWS cloud. The price of the storage on the S3 bucket is remarkably low. What’s more, introducing an S3 bucket lifecycle policy further lowers this price.
Such a policy automatically moves older files into different S3 storage classes, such as archive or deep archive. The classes also differ in access time, but for old data this is less of an issue: archived data is mainly accessed in case of an urgent event rather than for standard operational needs.
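A lifecycle policy like the one described above can be sketched in Python. This is a minimal illustration, not a definitive setup: the prefix, the day thresholds, and the bucket name in the comment are assumptions made for the example. The dictionary follows the shape that boto3's put_bucket_lifecycle_configuration call expects.

```python
import json


def build_lifecycle_rules(prefix, archive_after_days=90, deep_archive_after_days=365):
    """Build a lifecycle configuration that moves older objects to archive classes.

    The thresholds are illustrative assumptions, not recommendations.
    """
    if not 0 < archive_after_days < deep_archive_after_days:
        raise ValueError("archive_after_days must be positive and below deep_archive_after_days")
    return {
        "Rules": [
            {
                "ID": "archive-old-objects",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    # Older files first move to the archive class ...
                    {"Days": archive_after_days, "StorageClass": "GLACIER"},
                    # ... and later to the deep archive class.
                    {"Days": deep_archive_after_days, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    }


config = build_lifecycle_rules("photos/")
print(json.dumps(config, indent=2))

# With boto3 installed and credentials configured, the policy could be applied as
# follows ("my-asset-bucket" is a hypothetical bucket name):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-asset-bucket", LifecycleConfiguration=config)
```

Building the configuration as plain data keeps it easy to review and test before anything is applied to a real bucket.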
- You can organize your data in subfolders.
- You should set appropriate permission restrictions.
- Add tags to buckets to make them easy to identify and for possible use within dynamic S3 bucket policies.
An S3 bucket is serverless by design. It’s simply a storage space for your data.
AWS Athena Database
Athena makes it easy to create a basic data lake on AWS. It is a serverless query service that behaves like a database without servers and uses an S3 bucket to store its data. Data organization is maintained by structured file formats such as Parquet or comma-separated values (CSV). The S3 bucket holds the files, and Athena reads them whenever a process selects data from the database.
Just be aware that Athena doesn’t support various functionalities otherwise considered standard, for example, update statements. This is why you should regard Athena as a very simple option.
However, it supports partitioning, which plays a role similar to indexing by limiting how much data a query has to scan. It can also scale horizontally very easily, as this is no more complex than adding new buckets to the infrastructure. For creating a simple yet functional data lake, this still suffices in most cases.
For good performance, it is essential to choose a data design with future use in mind. Be very clear about how you intend to select the data, because re-creating tables once they exist and hold lots of data is difficult.
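As an illustration of designing the table around how the data will be selected, the sketch below generates a table definition partitioned by ingestion date. The table name, column names, bucket, and prefix are all hypothetical; the partition column should be whatever your most common queries filter on.

```python
def build_create_table_ddl(table, bucket, prefix):
    """Return an Athena DDL statement for a Parquet table partitioned by ingest date.

    Column names are assumptions made for this example.
    """
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n"
        "  asset_id string,\n"
        "  reading double,\n"
        "  measured_at timestamp\n"
        ")\n"
        "PARTITIONED BY (ingest_date string)\n"
        "STORED AS PARQUET\n"
        f"LOCATION 's3://{bucket}/{prefix}/'"
    )


def build_daily_query(table, ingest_date):
    """Return a query that filters on the partition column to limit scanned data."""
    return (
        f"SELECT asset_id, reading, measured_at FROM {table} "
        f"WHERE ingest_date = '{ingest_date}'"
    )


ddl = build_create_table_ddl("asset_readings", "my-asset-bucket", "readings")
print(ddl)
print(build_daily_query("asset_readings", "2024-01-31"))

# With boto3 and credentials available, the statement could be submitted as
# follows (the output location is a hypothetical bucket path):
# import boto3
# boto3.client("athena").start_query_execution(
#     QueryString=ddl,
#     ResultConfiguration={"OutputLocation": "s3://my-asset-bucket/athena-results/"})
```

Because the query filters on ingest_date, Athena only needs to read the files under the matching partition rather than the whole table.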
Athena DB is a great choice and a good fit for your goal if you are looking to create a simple and immutable data pool that is easy to scale horizontally over time.
AWS Aurora Database
Athena DB excels at storing uncurated data. This is how you want to store your original content to maximize its future reuse, after all. However, it is slow to provide select results to a front-end app.
For serving the front end, one of the best options, mainly because it is easy to set up, is the Aurora database running in serverless mode.
Aurora is far from a basic database. It is one of the most advanced and complex native relational database solutions in AWS, and it improves with every release.
What makes Aurora stand out from other relational services is that it can run in serverless mode. This is how the mode works:
- Configure the Aurora cluster in the AWS console. You specify the baseline CPU and RAM capacity as well as the maximum range of the auto-scale functionality. This determines how much capacity the Aurora cluster can dynamically add or remove. AWS decides to scale up or down based on the current utilization of the database.
- The Aurora cluster does not start until a user or process initiates a real request, for example, when scheduled batch processing starts or when the application makes a back-end API call to retrieve data from the database. The database then starts automatically and remains active for a predetermined time after the requests are completed.
- The Aurora cluster will automatically shut down if there is no more work in the database.
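The settings from the list above can also be expressed in code. The sketch below only builds the parameters. It assumes Aurora Serverless v2 and the parameter names commonly documented for boto3's create_db_cluster call (ServerlessV2ScalingConfiguration with MinCapacity, MaxCapacity, and SecondsUntilAutoPause), so verify them against the current AWS documentation before use. The cluster name and capacity values are hypothetical.

```python
def build_serverless_cluster_params(cluster_id, min_acu=0.0, max_acu=4.0,
                                    pause_after_seconds=300):
    """Build cluster parameters with an auto-scaling range and optional auto-pause.

    Capacities are in Aurora capacity units (ACUs). The values are illustrative.
    """
    if min_acu < 0 or max_acu < min_acu:
        raise ValueError("require 0 <= min_acu <= max_acu")
    scaling = {"MinCapacity": min_acu, "MaxCapacity": max_acu}
    if min_acu == 0:
        # Assumption: a minimum of 0 lets the cluster pause when idle, and this
        # setting controls how long it stays active after the last request.
        scaling["SecondsUntilAutoPause"] = pause_after_seconds
    return {
        "DBClusterIdentifier": cluster_id,
        "Engine": "aurora-postgresql",
        "ServerlessV2ScalingConfiguration": scaling,
    }


params = build_serverless_cluster_params("asset-platform-db")
print(params)

# Master credentials, networking, and the database instance itself are omitted
# here. With boto3 and credentials available, the call would look roughly like:
# import boto3
# boto3.client("rds").create_db_cluster(**params, MasterUsername="...", ...)
```

The minimum and maximum capacity define the range within which AWS scales the cluster, and the pause delay corresponds to the predetermined time the database stays active after the last request.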
To emphasize it one more time, serverless Aurora DB runs only when it has real work to do. The automatically started cluster shuts down again once it is no longer processing any work. You pay for the actual work, not for idle time.
The serverless Aurora is fully managed by AWS and does not require an administrator.
The front-end application reaches the data stored in the database through back-end API calls. The team should do most of the performance optimization on the back end. You can further reduce the risk of a slow UI response by designing efficient select statements directly inside the API calls.
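A back-end call with a deliberately narrow select statement might look like the sketch below. It assumes the RDS Data API is enabled on the cluster and that a boto3 "rds-data" client is passed in. The table name normalized_readings, its columns, and the ARNs are hypothetical.

```python
def fetch_latest_readings(client, cluster_arn, secret_arn, database, asset_id, limit=50):
    """Fetch only the columns and rows the UI needs for one asset.

    `client` is expected to behave like boto3.client("rds-data").
    """
    sql = (
        "SELECT asset_id, reading FROM normalized_readings "
        "WHERE asset_id = :asset_id "
        f"ORDER BY measured_at DESC LIMIT {int(limit)}"
    )
    response = client.execute_statement(
        resourceArn=cluster_arn,
        secretArn=secret_arn,
        database=database,
        sql=sql,
        # Named parameters keep user input out of the SQL text.
        parameters=[{"name": "asset_id", "value": {"stringValue": asset_id}}],
    )
    # Each record is a list of typed fields, one per selected column.
    return [
        {"asset_id": row[0]["stringValue"], "reading": row[1]["doubleValue"]}
        for row in response["records"]
    ]


# Usage with a real client (ARNs are placeholders):
# import boto3
# rows = fetch_latest_readings(
#     boto3.client("rds-data"), "arn:aws:rds:...", "arn:aws:secretsmanager:...",
#     "assets", "pump-17")
```

Selecting two columns, filtering on the asset, and capping the row count keeps the response small, which is the kind of select design that helps the UI stay responsive.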
AWS Step Functions
Even though all major components of a system are serverless, this does not guarantee a completely serverless architecture. This is possible only if all batch processes between the components are serverless.
AWS Step Functions is the native solution for this on the AWS cloud. A step function is made up of a connected list of AWS Lambda functions that together form a flow chart with clear start and end states. A Lambda function, usually written in Python or Node.js, is an executable piece of code that processes whatever is needed.
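As a minimal sketch of such a Lambda function in Python, the handler below reads an S3 event notification and extracts which files arrived. The parsing and loading steps are left out, and the returned structure is an assumption made for this example rather than a required format.

```python
from urllib.parse import unquote_plus


def lambda_handler(event, context):
    """Collect the bucket and key of every file reported in an S3 event."""
    files = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        files.append({"bucket": bucket, "key": key})
    # The return value becomes the input of the next step in a flow.
    return {"file_count": len(files), "files": files}


sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "my-asset-bucket"},
                "object": {"key": "photos/site+a/img%201.jpg"}}}
    ]
}
print(lambda_handler(sample_event, None))
```

Returning a small, plain dictionary makes it straightforward to pass the result on to the next function in the flow.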
The following is an example of how you might execute a step function:
- AWS automatically triggers a Lambda function whenever a new file arrives in the S3 folder. After parsing the file, the Lambda loads it into Athena. Before finishing, the Lambda stores its results either as a CSV file in an S3 bucket or in a database tracking table.
- This result is then used by the next Lambda to perform the next steps. This might include calling a machine learning model and transforming a subset of the new data into normalized tables. The last step can be to load the data into the Aurora database.
- A step function links these Lambdas together to form a batch flow. It is even possible to run another step function as a single step of a root step function. In this way, it is possible to cover many scenarios.
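The flow described in the list above can be written as a state machine definition in the Amazon States Language, which is plain JSON. The sketch below builds it as a Python dictionary. The state names and the Lambda ARNs are hypothetical.

```python
import json


def build_state_machine(ingest_arn, transform_arn, load_arn):
    """Chain three Lambda tasks into one batch flow with a clear start and end."""
    return {
        "Comment": "Ingest a file, transform it, and load the result into Aurora",
        "StartAt": "IngestFile",
        "States": {
            "IngestFile": {"Type": "Task", "Resource": ingest_arn,
                           "Next": "TransformData"},
            "TransformData": {"Type": "Task", "Resource": transform_arn,
                              "Next": "LoadToAurora"},
            "LoadToAurora": {"Type": "Task", "Resource": load_arn, "End": True},
        },
    }


definition = build_state_machine(
    "arn:aws:lambda:eu-west-1:111111111111:function:ingest",
    "arn:aws:lambda:eu-west-1:111111111111:function:transform",
    "arn:aws:lambda:eu-west-1:111111111111:function:load",
)
print(json.dumps(definition, indent=2))
```

Each state names its successor with Next, and the final state is marked with End, which gives the flow its clear start and end states. The output of one task is passed as input to the next.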
This serverless flow has one major drawback: each Lambda function can run for at most 15 minutes. Splitting the flow into smaller Lambda functions makes this less problematic.
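One way to stay within the time limit is to process work in batches and stop before the time runs out, handing the rest to a later invocation. The sketch below is one possible approach, not the only one. It relies on the get_remaining_time_in_millis method of the Lambda context object; the safety margin and the shape of the returned dictionary are assumptions.

```python
def process_in_time_budget(items, context, handle_item, safety_margin_ms=30_000):
    """Process items until the remaining run time drops below the safety margin."""
    processed = 0
    for item in items:
        # Stop early so the function can return cleanly before the timeout.
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            break
        handle_item(item)
        processed += 1
    return {
        "processed": processed,
        # Items left over can be passed to the next invocation or step.
        "remaining": items[processed:],
        "done": processed == len(items),
    }
```

A step function could loop on the "done" flag, invoking the same Lambda again with the remaining items until the work is finished.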
It is also possible to call multiple Lambda functions simultaneously in one step, which parallelizes that step. The flow waits for all parallel Lambdas to finish before it proceeds to the next step.
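In the Amazon States Language, this corresponds to a state of type Parallel. The sketch below builds such a state as a Python dictionary, with one branch per Lambda function. The branch names and ARNs are hypothetical.

```python
def build_parallel_state(branch_arns, next_state):
    """Build a Parallel state that runs one Lambda task per branch."""
    branches = []
    for index, arn in enumerate(branch_arns, start=1):
        name = f"Branch{index}"
        # Every branch is a small state machine of its own.
        branches.append({
            "StartAt": name,
            "States": {name: {"Type": "Task", "Resource": arn, "End": True}},
        })
    # The flow moves on to next_state only after all branches have finished.
    return {"Type": "Parallel", "Branches": branches, "Next": next_state}


state = build_parallel_state(
    ["arn:aws:lambda:eu-west-1:111111111111:function:resize-photos",
     "arn:aws:lambda:eu-west-1:111111111111:function:parse-readings"],
    "LoadToAurora",
)
print(state)
```

The state's output is a list with one result per branch, which the following step receives as its input.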
Serverless architecture offers a unique opportunity to create a cloud platform that covers the entire system landscape. This platform is horizontally scalable and has low operating costs.
It is a very good fit for budget-constrained projects. It is also an excellent option for exploration, typically when no one knows what the real production load will be, which often becomes clear only after you have onboarded all users. Project teams still get an overall view of how the system works, so these benefits come with few compromises.
This coverage won’t be adequate for all cases, particularly those that involve high CPU usage. However, the serverless use cases covered by the AWS cloud are constantly evolving. It is usually a good idea to research thoroughly before you decide on the serverless option for your next AWS cloud project.