Written by Arjen Schwarz
Tuesday at re:Invent is the day of the big keynote, and it likely doesn’t come as a surprise that once again a big focus of the day was generative AI, with a sprinkling of other exciting features. From this large number of announcements I had to pick five. The five I picked are the ones I find most interesting, but I highly recommend checking out the rest of the announcements as well.
Trn2 instances and UltraServers
New instance types are always fun, and this is especially the case when they’re this impressive. The new Trn2 instances are built specifically for training ML models, using the second-generation Trainium chips. These chips are 4x faster, offer 4x more memory bandwidth, and have 3x more memory capacity than those in the Trn1 instances. Each instance comes with 16 Trainium2 chips, as well as 192 vCPUs, 2 TiB of memory, and 3.2 Tbps of network bandwidth. That makes these pretty impressive powerhouses.
AWS also offers what it calls an UltraServer variant. This combines four Trn2 instances into a single low-latency unit for even more power. To show off the power of this setup, they announced that Anthropic will be developing its next model using these Trn2 instances.
The one downside with these instances right now is that they’re only available in the Ohio region, and that you need to reserve them in advance using EC2 Capacity Blocks for ML.
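For those who want to try them, that reservation happens through the EC2 Capacity Blocks API. Below is a minimal sketch using boto3; the trn2.48xlarge instance type name, the instance count, and the date range are assumptions for illustration rather than taken from the announcement.

```python
from datetime import datetime, timedelta, timezone

import boto3

# Capacity Blocks are an EC2 feature, so we talk to the EC2 API in Ohio.
ec2 = boto3.client("ec2", region_name="us-east-2")

# Look for an available block of Trn2 capacity for a week of training.
# Instance type and count here are assumptions for the example.
offerings = ec2.describe_capacity_block_offerings(
    InstanceType="trn2.48xlarge",
    InstanceCount=16,
    CapacityDurationHours=7 * 24,
    StartDateRange=datetime.now(timezone.utc),
    EndDateRange=datetime.now(timezone.utc) + timedelta(days=30),
)

# Purchase the first matching offering; in practice you would compare
# prices and start dates before committing to one.
if offerings["CapacityBlockOfferings"]:
    offering_id = offerings["CapacityBlockOfferings"][0]["CapacityBlockOfferingId"]
    ec2.purchase_capacity_block(
        CapacityBlockOfferingId=offering_id,
        InstancePlatform="Linux/UNIX",
    )
```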
Q Developer agent improvements
Q Developer (once known as CodeWhisperer) integrates with your IDE and is aimed at code generation, making it a handy tool for quickly generating the code you need as a developer. In addition, it offers a chat agent that you can ask questions about your codebase. Today’s announcements add several new features to that agent to make your life as a developer easier.
First among these is the ability to generate documentation. It does this by creating or updating a README file in your project. Based on some initial testing it seems to do a decent job at this. Of course, as with all code generation, this doesn’t take away the need to look over the result and expand or modify it as needed. One caveat, though, is the quotas: it can only update a README file of up to 15KB, in a project of up to 200MB uncompressed. As I already ran into the README limit on one of my personal projects, that seems a bit low.
Second is the ability to detect issues. This will scan your code and show any issues it finds. Based on some initial testing, it finds common issues, but the way you get to the details was a bit unexpected. I recommend reading the documentation, because the findings aren’t shown in the chat interface; instead they are reported into your IDE and show up alongside any other issues your IDE detects. That is the right place to put them, but the chat interface could do a better job of pointing you there.
The last new feature is the ability to generate unit tests. Unfortunately this is limited to Java and Python, and I didn’t have a codebase of sufficient size in those languages to test it myself. However, generating unit tests is one of the most useful features of code generation tools, and based on the documentation it seems to do this well.
Together these three new features are a welcome addition to Q Developer and should make it easier to offload some of the heavy lifting for your applications or infrastructure.
Bedrock Multi-Agent Collaboration
Multi-agent collaboration is a new feature for Bedrock that allows you to run multiple agents at the same time. This means that instead of using a single agent that needs to be able to do everything, you can have a number of subagents that are each configured for specific tasks and should therefore be more attuned to the output you want from them. On top of these sits a supervisor agent that handles querying the subagents, possibly even asking follow-up questions, and then combines their outcomes into a single response. The subagents themselves can have subagents as well, with a soft limit of three layers.
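From the caller’s side not much changes: you invoke the supervisor the same way you would invoke any other Bedrock agent, and the delegation to subagents happens on the Bedrock side. A minimal sketch with boto3, where the agent and alias IDs are placeholders for a supervisor that has already been configured with its collaborators:

```python
import uuid

import boto3

# Placeholder IDs for an already-configured supervisor agent whose
# subagents handle, say, billing questions and technical questions.
SUPERVISOR_AGENT_ID = "AGENT1234"
SUPERVISOR_ALIAS_ID = "ALIAS1234"

client = boto3.client("bedrock-agent-runtime", region_name="ap-southeast-2")

# The supervisor is invoked like any other agent; routing to subagents
# and combining their answers is handled by Bedrock.
response = client.invoke_agent(
    agentId=SUPERVISOR_AGENT_ID,
    agentAliasId=SUPERVISOR_ALIAS_ID,
    sessionId=str(uuid.uuid4()),
    inputText="Why did my bill go up last month, and how do I set a budget alert?",
)

# invoke_agent returns an event stream; collect the text chunks.
answer = ""
for event in response["completion"]:
    if "chunk" in event:
        answer += event["chunk"]["bytes"].decode("utf-8")

print(answer)
```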
Currently this feature is in preview, but it’s available everywhere that Bedrock Agents are, which includes our own Sydney region.
Amazon Nova models
While AWS offers a lot of third-party models for use in Bedrock, the new Nova foundation models are designed by AWS itself. There are currently three versions, each aimed at different use cases, with a fourth slated for release early next year.
Nova Micro is the lowest-cost model and is aimed at text-based tasks like text summarisation, translation, and simple mathematical reasoning and coding. Nova Lite and Nova Pro are multi-modal models that can process text, images, and video. Nova Premium, an even more capable version, is expected in early 2025.
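As these are regular Bedrock models, you can call them through the standard Converse API. A quick sketch with boto3; the Nova Micro model ID shown is my assumption based on the usual naming scheme, so check the model catalogue before relying on it:

```python
import boto3

# Nova is currently only available in the US, so we point at us-east-1.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    # Assumed model ID for Nova Micro; verify against the Bedrock console.
    modelId="amazon.nova-micro-v1:0",
    messages=[
        {
            "role": "user",
            "content": [{"text": "Summarise this in one sentence: ..."}],
        }
    ],
    inferenceConfig={"maxTokens": 200, "temperature": 0.2},
)

# The Converse API returns the reply as a message with content blocks.
print(response["output"]["message"]["content"][0]["text"])
```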
Aside from these, the Nova family also includes two content generation models: Nova Canvas for generating images and Nova Reel for generating videos. The demo video in the announcement looks quite good, but it will need some real-world use to verify how well it works.
As of right now, however, all of these models are only available in the USA. Hopefully they’ll become available here in the Sydney region soon.
Aurora DSQL
I’ve added one announcement that doesn’t involve AI/ML, because it seems extremely interesting even though it’s still only in preview. Aurora DSQL (which stands for Distributed SQL) is another flavour of Aurora. We already have Aurora Serverless and Aurora Limitless, but DSQL is aimed at offering multi-region functionality by being a truly distributed database.
This means it has write access in each region it’s deployed in, and all writes are synced globally as quickly as possible. To achieve this, Aurora DSQL only checks each transaction at commit time and, on commit, parallelises the writes across all regions. To keep the order of events intact it utilises Amazon Time Sync, the NTP service that AWS designed and that gained microsecond accuracy late last year.
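That commit-time check means your code needs to be ready for a transaction to fail at COMMIT rather than at each individual statement. A rough sketch of what that looks like over the PostgreSQL-compatible interface; the endpoint, user, and token are placeholders, and the assumption (mine, not from the announcement) is that a conflict surfaces as a standard PostgreSQL serialisation failure:

```python
import time

import psycopg2

# Placeholder connection details; a real Aurora DSQL cluster uses an
# IAM-based auth token in place of a password.
conn = psycopg2.connect(
    host="your-cluster.dsql.us-east-2.on.aws",  # placeholder endpoint
    port=5432,
    dbname="postgres",
    user="admin",
    password="<iam-auth-token>",
    sslmode="require",
)

# Because transactions are only validated at commit time, a concurrent
# write to the same rows shows up as an error on COMMIT, so the usual
# pattern is to retry the whole transaction with a small backoff.
for attempt in range(5):
    try:
        with conn.cursor() as cur:
            cur.execute(
                "UPDATE accounts SET balance = balance - %s WHERE id = %s",
                (100, 42),
            )
        conn.commit()
        break
    except psycopg2.errors.SerializationFailure:
        conn.rollback()
        time.sleep(0.1 * (attempt + 1))
```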
Unfortunately, Aurora DSQL is still only in preview and only available in the Ohio region, but during this period it is free to use and experiment with. With the obvious caveat that, being limited to a single region, its multi-region capabilities can’t really be tested today.