There are several issues here:
(1) Using your DAW to 'play' your rack by sending it v/oct, gate and other CV signals
(2) Capturing the audio from your rack and feeding it into your DAW
(3) Getting the result to your speaker.
(4) Sync

A common approach for (1) would be to use a MIDI-to-CV converter (Mutable Yarns, Expert Sleepers FH-1/2, many others) to take MIDI signals from your DAW and give you v/oct, gate and mod signals. In fact, your KeyStep will already do this: feed its v/oct, gate and mod outputs to your rack (probably what you're doing already), hook it to your Mac by USB, create an external MIDI track in your DAW, enter a few notes in the MIDI editor and bingo.
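If you're curious what a MIDI-to-CV converter is doing with the pitch, here's a rough Python sketch of the 1 V/oct maths (one volt per octave, so 1/12 V per semitone). The function name and the choice of MIDI note 36 as the 0 V point are my own assumptions for illustration; real converters differ on which note maps to 0 V.

```python
def midi_note_to_volts(note: int, zero_volt_note: int = 36) -> float:
    """Convert a MIDI note number to a 1 V/oct pitch voltage.

    zero_volt_note is an assumed reference (the note that gives 0 V);
    check your converter's manual for the real value.
    """
    # 12 semitones per octave, 1 volt per octave
    return (note - zero_volt_note) / 12.0


# One octave above the reference note should come out 1 V higher.
for n in (36, 48, 60):
    print(n, midi_note_to_volts(n))
```

The gate is simpler: it just goes high on note-on and low on note-off.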

The common approach for (2) would be to feed the output from either your mixer or your rack to a regular (non-Eurorack) USB or Thunderbolt audio interface box. You can then add an audio track in your DAW and it will get its input from the rack. Most audio interfaces will also drive your active speakers and have a 'mix' knob so you can fade between the output of your DAW and the 'input' from your rack. It's your choice whether you get an audio interface with lots of inputs and use your DAW as a mixer, feed the stereo output from your mixer into the audio interface... or feed the output from the DAW into your mixer. Depends what you're doing.

The ES-8 and ES-9 are, fundamentally, Eurorack-format audio interfaces. They do integrate nicely into your rack, and transfer multi-channel audio to and from your DAW. The ES-9 even has an extra pair of outputs that will drive your speakers or mixer. However, the main reason for choosing them over a regular (and maybe cheaper) non-Euro audio interface is that they can also carry CV (v/oct, gate, mod etc.) to and from your DAW. Regular audio interfaces are (usually) 'AC coupled', which means that they'll filter out DC and anything much below audio frequency, so if you try to pass control voltages they'll get mangled. The ES-8/9 are 'DC coupled', which means that they can carry slowly-changing control voltages as well as audio. This lets you do cool things like using 'modular' software synths like VCV Rack or Reaktor as an extension to your Eurorack system and enjoy infinite free modules. You can also use them as an arguably superior alternative to MIDI for controlling your rack from the DAW, but you'll need a plug-in like Expert Sleepers' Silent Way for Logic/Reaper or Ableton's CV Tools (I think Bitwig has built-in support, which may be worth checking out if you're shopping for DAWs), and it's more hassle than just using MIDI-to-CV.
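To see why AC coupling mangles CV, here's a small Python sketch that models the coupling as a simple one-pole high-pass filter and feeds it a held 2 V control voltage. The 10 Hz cutoff, the 48 kHz sample rate and the function name are assumptions picked for illustration, not the specs of any particular interface.

```python
import math


def ac_couple(samples, sample_rate=48000.0, cutoff_hz=10.0):
    """Rough stand-in for an AC-coupled input: a one-pole high-pass filter.

    cutoff_hz is an assumed value; real interfaces vary.
    """
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        # Standard discrete RC high-pass: only changes in the input get through
        y = alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out


held_cv = [2.0] * 48000          # a 2 V pitch CV held for one second
coupled = ac_couple(held_cv)     # what an AC-coupled path would pass on

print("first sample:", coupled[0])   # close to 2 V: the initial step gets through
print("last sample:", coupled[-1])   # essentially 0 V: the held voltage has drained away
```

A DC-coupled interface skips that high-pass stage, so the held 2 V stays at 2 V and your oscillator stays in tune.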

As for sync - if you're using the KeyStep, there are mini-switches on the back that let you select the 'sync' source. If you set that to USB and tell your DAW to send MIDI clock and start/stop data to the KeyStep, then the sequencer and arp should be nicely synced to your DAW. You've also got a sync jack on the back which you can feed to sequencers and stuff (clocked delays etc.) on the Eurorack. I guess the BeatStep will have similar settings.
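For a feel of what the DAW is sending when it syncs over MIDI: MIDI clock runs at 24 pulses per quarter note, so the gap between clock messages depends only on the tempo. A quick Python sketch (the function name is mine, purely for illustration):

```python
MIDI_CLOCK_PPQN = 24  # MIDI timing clock: 24 pulses per quarter note


def midi_clock_interval(bpm: float) -> float:
    """Seconds between successive MIDI clock messages at a given tempo."""
    seconds_per_beat = 60.0 / bpm
    return seconds_per_beat / MIDI_CLOCK_PPQN


for tempo in (90, 120, 140):
    print(tempo, "BPM ->", round(midi_clock_interval(tempo) * 1000, 2), "ms per clock")
```

The receiving device counts those pulses to work out where the beat is, which is why the DAW has to send start/stop as well as the clock itself.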

I wouldn't rule out the ES-9 but it might be easier to start experimenting with the KeyStep and a regular audio interface (if you don't already have one, Behringer do some cheap'n'cheerful ones like the UMC404 that won't hurt too much if you change your mind).

So what's the minimum setup that I need to use just Clouds?
-- soundmodel

Have you tried the free "virtual" version of Clouds in VCV Rack (a free Eurorack simulator)?

The "Audible Instruments -> Texture Synth" module is an implementation of Clouds (legit, since a lot of the MI stuff is open-source).

You can then use the "VCV Bridge" virtual effect VST/AU plug-in to route the output of your DAW through your virtual Eurorack. Or, if your sound source doesn't support AU/VST, use a virtual audio device like Soundflower or Loopback (on Mac - I'm sure there are PC & Linux equivalents) to route your audio via your virtual rack.

NB: remember, the first hit is always free... you'll soon succumb to that creeping feeling that using virtual modular gear doesn't quite hit the spot, and start craving real knobs and patch cables...