Short answers to your questions
What's the 1090 ES?
The SSR protocol governs transmissions on the SSR frequencies (1030 and 1090 MHz). Mode S is the part of this protocol that manages data exchange between interrogator and transponder. 1090ES is the part of Mode S describing how a transponder can transmit long unsolicited messages (extended squitters), which are used to carry ADS-B messages (DF17/DF18).

Layers involved in ATCRBS and ADS-B transmission, source
If an aircraft uses 1090ES for ADS-B Out, must it be equipped with Mode S?
Yes, 1090ES relies on Mode S. But ADS-B does not have to use a transponder to broadcast its messages; it can use UAT (or VDL) as well.
Does every Mode S transponder support 1090ES?
No. Some older Mode S transponders do not support 1090ES: they cannot send the long messages required to broadcast ADS-B data.
The 'S' in Mode S stands for 'selective', but that doesn't match ADS-B, which broadcasts without any interrogation signal.
Selective interrogation is used by the transponder to filter incoming interrogations. ADS-B is a 'broadcast' service: messages are sent automatically by the transponder (hence the name 'squitter'). From the transponder's standpoint, these messages are not replies to interrogations, even though they use 1090 MHz, the downlink frequency used both to reply and to broadcast.
More details follow if you're interested.
SSR/Mode S and ADS-B technologies, like PBN/RNAV, are not well treated in the technical literature because of the numerous changes made during their infancy. As a result, the wording is often misused and the big picture is blurry: unrelated details are discussed without the whole context.
The key point is to identify clearly, as in the OSI networking model, the different functional layers and data formats used. ADS-B and SSR Mode S are not on the same level: ADS-B is a higher-level service that uses the lower-level transmission capabilities of Mode S.
And indeed ADS-B has nothing to do with the SSR reply-to-interrogation mechanism that is also found in Mode S; it is driven by a different, higher-level service.
Mode S concept
Mode S specifications include two aspects:
The possibility to identify an aircraft using SSR: This part is compatible with modes A/C. Mode A and C transmissions have simple formats, with fixed information embedded in 12 SSR pulses.
The possibility to exchange data: Mode S is bidirectional; it defines data frames to transmit or receive data with varying format and content. This is actually a rudimentary network protocol. A Mode S transponder is able to receive and send these varying formats. ADS-B messages are sent using this data transfer capability.
1090ES Out is a layer within the transponder which transmits only. 1090ES is capable of assembling a long message (112 bits). In this scheme the ADS-B logic provides the data to be transmitted. Transponders work with 'registers': when it's time to send a Mode S frame, the transponder uses the data available in the registers to assemble the frame, then sends it. So ADS-B stores the data to be transmitted in the transponder registers, and the 1090ES layer assembles the frame, known as an 'extended squitter'. 'Squitter' because the transponder sends it without prior interrogation (broadcast), 'extended' because its length is 112 bits instead of 56.
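The register-to-frame step described above can be sketched as follows. This is only an illustration, not how any particular transponder is implemented: the `registers` dictionary, its key, and the function names are hypothetical, and the values are example data. The 112 bits are laid out as 5 bits of downlink format, 3 bits of capability, a 24-bit aircraft address, the 56-bit ADS-B payload, and 24 parity bits computed with the Mode S generator polynomial.

```python
# Sketch only: the register store and all function names are hypothetical.
GENERATOR = 0x1FFF409  # Mode S parity generator polynomial (25 bits, leading 1 included)


def crc24(value: int, nbits: int) -> int:
    """Remainder of value * x^24 divided by GENERATOR (plain polynomial division)."""
    reg = value << 24
    for i in range(nbits + 23, 23, -1):
        if (reg >> i) & 1:
            reg ^= GENERATOR << (i - 24)
    return reg & 0xFFFFFF


def assemble_extended_squitter(icao: int, capability: int, me: int) -> int:
    """Wrap a 56-bit payload into a 112-bit DF17 frame, returned as an integer."""
    header = (17 << 3) | (capability & 0x7)           # DF (5 bits) + CA (3 bits)
    body = (header << 24) | (icao & 0xFFFFFF)         # + 24-bit aircraft address
    body = (body << 56) | (me & ((1 << 56) - 1))      # + 56-bit payload = 88 bits
    return (body << 24) | crc24(body, 88)             # + 24 parity bits = 112 bits


# The ADS-B logic stores a payload in a (hypothetical) register...
registers = {"identification": 0x202CC371C32CE0}
# ...and the 1090ES layer assembles the frame when it is time to squitter.
frame = assemble_extended_squitter(0x4840D6, 5, registers["identification"])
print(f"{frame:028X}")
```

A receiver can check the frame by running `crc24` over all 112 bits: an undamaged extended squitter gives a remainder of zero.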
What is involved when an ADS-B transmission occurs
Data preparation:
- DF17 (or DF18 when the message is not to be sent by a transponder), the format of a message carrying ADS-B data
- ADS-B logic feeds the mode S transponder registers relevant to ADS-B with data.
Data transmission, if using SSR:
- 1090ES, allowing long (112-bit) Mode S messages (extended squitters)
- A 1090ES-compatible Mode S transponder to send DF17 messages on the SSR frequency, using the extended squitter feature and the data in the registers.

ADS-B message generation
SSR interrogation
An ATCRBS interrogator on the ground can interrogate all transponders (A/C/S), using multiple combinations to select only mode A/C, only Mode S, or both. It can also elicit replies from all Mode S transponders, or from a single one. This is the selective interrogation function you refer to in your question.
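As a rough illustration of this filtering, a Mode S transponder accepts an interrogation when it is addressed either to its own 24-bit aircraft address or to the all-call address (24 ones). The function name below is hypothetical, and the sketch deliberately leaves out how the address is recovered from the interrogation and any lockout logic.

```python
# Sketch only: real transponders extract the address from the interrogation
# itself and apply lockout rules, both omitted here.
ALL_CALL_ADDRESS = 0xFFFFFF  # 24 ones: addressed to every Mode S transponder


def accepts(interrogation_address: int, own_address: int) -> bool:
    """Return True if the transponder should process this interrogation."""
    return interrogation_address in (ALL_CALL_ADDRESS, own_address)


own = 0x4840D6  # example aircraft address
print(accepts(ALL_CALL_ADDRESS, own))  # all-call
print(accepts(own, own))               # selective, addressed to us
print(accepts(0x123456, own))          # selective, addressed to someone else
```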
Mode S reply versus mode S broadcast
In Mode S, responding to an ATCRBS interrogation is usually done by the same logic that is at work in modes A and C, except that an additional 56-bit data frame is added.
In addition, Mode S transponders can broadcast frames without prior solicitation; this is the squitter mode.
Basic data level: Mode S data format
Mode S is based on a 56-bit data frame which can be extended to 112 bits. What can be inserted in this frame is defined in ICAO Annex 10: there are 25 different frame contents (formats) for uplink (UF formats, received by the transponder) and 25 formats for downlink (DF formats, transmitted by the transponder).
Among these downlink formats there are two important ones: DF11, used to reply to an 'all call' interrogation (equivalent to replies in modes A and C), and DF17, which includes additional information, among it the position, and is used for ADS-B Out.
Short vs long data frames
A Mode S transponder can broadcast two types of unsolicited messages: short (56 bits) and long (112 bits). Initial versions of Mode S transponders were designed to handle short messages only. This was compatible with the DF11 message used as an acquisition squitter for TCAS and to reply to Mode S acquisition interrogations (UF11).
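A receiver can tell the two lengths apart from the format number alone, since the downlink format sits in the first five bits of every frame. The sketch below assumes the usual convention that formats 16 and above are long and that every format starting with bits `11` is treated as DF24; the function names are hypothetical.

```python
def downlink_format(first_byte: int) -> int:
    """Extract the DF number from the first byte of a Mode S downlink frame."""
    df = first_byte >> 3
    return 24 if df >= 24 else df  # formats beginning with '11' all count as DF24


def frame_length_bits(df: int) -> int:
    """Short frames are 56 bits, long (extended) frames are 112 bits."""
    return 112 if df >= 16 else 56


for first_byte in (0x5D, 0x8D):  # typical first bytes of DF11 and DF17 frames
    df = downlink_format(first_byte)
    print(f"DF{df}: {frame_length_bits(df)} bits")
```

With these inputs the first byte `0x5D` is classified as a short DF11 frame and `0x8D` as a long DF17 frame.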

Mode S downlink formats currently in use
The long format, usually referred to as the extended squitter, is the one used to squitter the DF17 messages used by ADS-B.
1090ES refers to a long message transmitted on 1090 MHz, the frequency used by Mode S to (reply and) broadcast.
The name 'squitter' comes from DME transponders. The ground station used in DME is a transponder which replies to aircraft interrogators. In order to transmit something without interrogation, and to allow aircraft to synchronize on the DME station, these transponders sent random pulses called squitters. Mode S reused this terminology to refer to any unsolicited transmission.
One level up: The ADS-B service
ADS-B is a user of the transponder. How ADS-B is implemented is actually left undefined, but it's usually a piece of software in the transponder box, or a separate box. It builds DF17 messages and asks the Mode S transponder to send them using the Mode S protocol.
The chain is somewhat similar to using a modem to communicate:
- ADS-B is the application creating DF17 messages to be sent like a web browser creates HTTP messages.
- DF17 messages are encapsulated in squitters, like HTTP messages are encapsulated in IP packets.
- Squitters are transmitted by a mode S transponder like IP packets are transmitted using a modem.
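Seen from the receiving side, the same layering can be peeled off in reverse. The sketch below is only an illustration with a hypothetical function name: it splits a 112-bit frame, given as 28 hex characters, into the Mode S envelope (format, address, parity) and the 56-bit ADS-B payload, whose first five bits give the ADS-B message type. The input is a widely circulated example frame.

```python
def peel_layers(frame_hex: str) -> dict:
    """Split an extended squitter into its Mode S envelope and ADS-B payload."""
    raw = bytes.fromhex(frame_hex)
    if len(raw) != 14:
        raise ValueError("an extended squitter is 112 bits (14 bytes) long")
    return {
        "downlink_format": raw[0] >> 3,          # Mode S layer: 17 for ADS-B
        "icao_address": raw[1:4].hex().upper(),  # Mode S layer: aircraft address
        "me_payload": raw[4:11].hex().upper(),   # ADS-B layer: 56-bit message
        "type_code": raw[4] >> 3,                # ADS-B layer: message type
        "parity": raw[11:14].hex().upper(),      # Mode S layer: 24 parity bits
    }


layers = peel_layers("8D4840D6202CC371C32CE0576098")
for name, value in layers.items():
    print(f"{name}: {value}")
```

Only `me_payload` and `type_code` belong to ADS-B; everything else is the Mode S envelope, in the same way an IP header surrounds an HTTP message.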