5

I am trying to build images for our company's Maven-based applications. These apps use some libraries that live in a local Maven repository, which takes up a lot of disk space. They also have other dependencies that must be downloaded. I have seen this question about downloading dependencies (which unfortunately has no accepted answer), but for me it is more important to use my local repository without adding it to the image (which increases the image size from 300 MB to 4 GB); caching downloaded dependencies comes second.

If I wanted to ADD my local repository to the image, I would write this Dockerfile:

FROM java:8.0_31
MAINTAINER Zeinab Abbasimazar (zeinab.abbasi@peykasa.ir)
ADD .m2/ $HOME
WORKDIR .
RUN apt-get update && apt-get install -y maven && cd Development/ && mvn clean install -P test
CMD ["./Component.sh"]

Which makes this image:

sudo docker images
REPOSITORY                                   TAG                 IMAGE ID            CREATED             SIZE
maven                                        3.3.9               44ab79a0bd9e        2 hours ago         4.678 GB

An image of this size is not acceptable for me; could anyone please help me with this? Is there a workaround that uses GitLab or Jenkins to play the role of the Maven repository?

2 Answers

2

Docker images are after all VM templates, i.e. they have to be more or less self-contained: you get the image, you run the environment with all the dependencies.

Operationally, Docker supports reusing environments through the FROM statement, i.e. you could maintain a base image and, on top of it, a much smaller image containing the app itself.
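
A minimal sketch of that base-image split, written as a shell helper that generates the two Dockerfiles. The image name `company/maven-base:1`, the file names, and the `/root/.m2/` target (which assumes the build runs as root) are illustrative assumptions, not part of the question; the `java:8.0_31` tag and the `Development/` directory are taken from the question's Dockerfile.

```shell
#!/usr/bin/env bash
# Sketch: write a large, rarely rebuilt base image definition and a small
# application image definition that starts FROM it.
# All names below (company/maven-base:1, Dockerfile.base, Dockerfile.app)
# are hypothetical placeholders.
write_dockerfiles() {
  local dir="$1"

  # Base image: Maven plus the local repository. Built once and shared.
  cat > "${dir}/Dockerfile.base" <<'EOF'
FROM java:8.0_31
RUN apt-get update && apt-get install -y maven
ADD .m2/ /root/.m2/
EOF

  # App image: only the project sources and build output sit on top.
  cat > "${dir}/Dockerfile.app" <<'EOF'
FROM company/maven-base:1
COPY Development/ /Development/
WORKDIR /Development
RUN mvn clean install -P test
EOF
}

# Usage, from the directory that contains .m2/ and Development/:
#   write_dockerfiles .
#   docker build -t company/maven-base:1 -f Dockerfile.base .
#   docker build -t company/component:1 -f Dockerfile.app .
```

The base image is still large, but it is stored and transferred once; each application image only adds its own layers on top of it.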

From a strategic point of view, however, I would say your architects might want to consider splitting the whole thing into smaller microservices, which would allow for 5 smaller images. Of course, I cannot judge whether that is really possible.

Dan Cornilescu
Ta Mu
  • Docker images aren't "VM templates", as containerisation isn't the same as a VM. – Hauleth Aug 17 '17 at 11:46
  • I yet have to search for the right term, but indeed with a Docker container you can simulate a host, therefore I consider containers to be lightweight VMs (in terms of Turing machine but also with a virtual network adapter and mountable virtual storage) – Ta Mu Aug 17 '17 at 12:57
  • A Turing machine has nothing to do with either Docker or VMs. Virtual machines simulate (nomen omen) a machine, while containers (like Docker) only restrict user space. The difference is crucial. And in any case, Docker wasn't meant to provide "lightweight VMs"; that is the role of LXC. – Hauleth Aug 17 '17 at 13:01
  • then I need some better and correct wording to explain mgmt why and how Docker could be better than VMWare.. worth of another SE question :-) – Ta Mu Aug 17 '17 at 15:07
  • You can think of Docker containers as fat JARs for something other than Java: you pack all your dependencies and provide a single file that can be run as an application (from the end user's viewpoint). LXC and VMs are rather system-like environments where you run your init system and all the things that normally run during startup. – Hauleth Aug 19 '17 at 11:15
  • what if I need to have a system-like environment in Docker? i.e. many middleware services – Ta Mu Aug 21 '17 at 14:05
  • Then use either multiple Docker containers or LXC. – Hauleth Aug 21 '17 at 14:06
  • which effectively replaces set of VMs. So I can't understand why I can't tell to business, "let's deploy instead of that MySQL VM just a MySQL container" => clever guys use VMs as service containers as well – Ta Mu Aug 21 '17 at 14:20
  • You should never run DB in container. – Hauleth Aug 21 '17 at 14:23
  • Integration environment neither - just for CI testing? – Ta Mu Aug 21 '17 at 15:45
  • In that case you shouldn't care. But in CI do you run multiple things in container? – Hauleth Aug 21 '17 at 16:02
  • @Hauleth docker being more or less lxc with aufs (non persistent file system), are you really speaking about LinuXContainer or something else? I fully agree with the difference between vm, VMware, xen, kvm or else doing hardware emulation against containers doing process isolation but I've a hard time to relate that to lxc) – Tensibai Nov 20 '17 at 22:14
  • @PeterMuryshkin A MySQL server is not just the mysql service once you move beyond toy projects. To run properly in production, MySQL needs access to low-level I/O for caching, needs to swap indexes properly, and needs a bunch of other things that suffer a lot from namespace isolation in a container. Add to that the complexity of running maintenance jobs (hourly/weekly) that need to work in sync between the MySQL server and the filesystem. Containers are not lightweight VMs. – Tensibai Nov 20 '17 at 22:18
  • BTW the point of Uber jar stands and is a proper comparison, I think I get the confusion between Java vm and hardware vm from this point of view. – Tensibai Nov 20 '17 at 22:20
0

I would take a different approach to your problem. Instead of adding the Maven repository to the image, mount one volume at $HOME/.m2/repository and another at $PROJECT_DIR:

ENV PROJECT_DIR=/project
VOLUME /repository
VOLUME ${PROJECT_DIR}
RUN mkdir -p $HOME/.m2 /repository && ln -s /repository $HOME/.m2/repository
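
At run time, the host's repository and project checkout would then be bind-mounted onto those two volumes. A minimal sketch, assuming an image built from the Dockerfile above; the function name `run_component`, the image name, and the host paths are hypothetical placeholders:

```shell
#!/usr/bin/env bash
# Sketch: start the container with the host's Maven repository and the
# project directory mounted onto the volumes declared in the Dockerfile
# (/repository and /project). Names and paths are illustrative only.
run_component() {
  local image="$1" host_repo="$2" host_project="$3"
  # DOCKER can be overridden (e.g. DOCKER=echo) to preview the command
  # instead of running it.
  "${DOCKER:-docker}" run --rm \
    -v "${host_repo}:/repository" \
    -v "${host_project}:/project" \
    "${image}"
}

# Usage (placeholder image name and paths):
#   run_component my-component "$HOME/.m2/repository" "$PWD/Development"
```

Because the repository lives on the host, it never becomes part of an image layer, and dependencies that Maven downloads during the first run remain available to later runs.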

After that, add a check in Component.sh that detects whether this is the first run and, if so, builds the required component:

...
if [[ ! -e $PROJECT_DIR/app_is_ready ]]; then
  cd $PROJECT_DIR
  mvn clean install -P test
  touch $PROJECT_DIR/app_is_ready
fi
<<actual run commands>>

Please note that the example code does not handle the race condition in which two (or more) instances that share the same volume for PROJECT_DIR are started at the same time.
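
One way to close that race, sketched under the assumption that `flock` (from util-linux) is available in the image and that the shared volume is on a filesystem where such locks are honored across containers. The function name `build_once` and the lock file name `.build.lock` are made up for illustration:

```shell
#!/usr/bin/env bash
# Sketch: run the first-time build at most once, even if several containers
# sharing the same project directory start simultaneously.
# Usage: build_once <project_dir> <build command...>
build_once() {
  local project_dir="$1"; shift
  local marker="${project_dir}/app_is_ready"
  local lock="${project_dir}/.build.lock"
  (
    # Block until this process holds an exclusive lock on fd 9.
    flock -x 9 || exit 1
    # Re-check under the lock: another instance may have finished meanwhile.
    if [[ ! -e "${marker}" ]]; then
      cd "${project_dir}" || exit 1
      "$@" || exit 1
      touch "${marker}"
    fi
  ) 9>"${lock}"
}

# Usage inside the start script, before the actual run commands:
#   build_once "$PROJECT_DIR" mvn clean install -P test
```

The lock is released automatically when the subshell exits, so an instance that was waiting proceeds, sees the marker file, and skips the build.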

Hope this helps!