Relationship between channels C and bpp #5

Open
chenxianghu opened this issue May 26, 2018 · 20 comments

@chenxianghu

You said "C=8 channels (compression to 0.072 bpp)" in the Results section.
I don't understand the relationship between C and bpp, can you explain it to me?

We compare this codec's compression performance against BPG:
PNG image -> decode -> our encoder -> our quantizer -> quantized representation

So the bpp comparison is between our quantized representation and the BPG-encoded file:
at the same bpp, which decoded image quality is better, or at the same decoded image quality, which bpp is lower. Is that right?

@Justin-Tan
Owner

If you read the original paper (https://arxiv.org/pdf/1804.02958.pdf), the upper bound on the bitrate is given by Eq. 5, where dim(w_hat) is determined by the number of channels C.
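
To make that concrete, a quick back-of-the-envelope calculation, assuming the paper's default setup of a 16x spatial downsampling in the encoder and L = 5 quantization centers (both values are assumptions taken from the paper, not read out of this repo):

import math

H, W = 512, 1024                        # Cityscapes resolution used here
C = 8                                   # bottleneck channels
L = 5                                   # assumed number of quantization centers (paper)
dim_w_hat = (H // 16) * (W // 16) * C   # assumed 16x spatial downsampling (paper)
bpp = dim_w_hat * math.log2(L) / (H * W)
print(round(bpp, 4))                    # -> 0.0726, consistent with the quoted 0.072 bpp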

@chenxianghu
Author

I want to test the performance of this model, so I modified the single_plot function as shown below and ran your compress.py. There are two steps:

  1. original image -> quantized representation: ~623 ms
  2. quantized representation -> reconstructed image: ~115 ms

Is my test method right? And what timings do you measure on your side?

If I want to build end-to-end image compression, I think I should save the quantized representation to a file on the sender side and recover the reconstructed image from that file on the receiver side, with both sender and receiver loading the well-trained model. Is my thinking right?

def single_plot(epoch, global_step, sess, model, handle, name, config, single_compress=False):

    real = model.example
    gen = model.reconstruction
    zz = model.z
    start = time.time()
    # Original: run encoder + decoder in a single pass.
    # r, g = sess.run([real, gen], feed_dict={model.training_phase: True, model.handle: handle})
    # Modified: first run the encoder + quantizer only.
    r, z = sess.run([real, zz], feed_dict={model.training_phase: True, model.handle: handle})
    print("encoder + quantizer time: {:.3f} s".format(time.time() - start))
    print('z shape:', z.shape)
    # print('z result:', z)
    start = time.time()
    # Then feed the quantized representation back through the generator.
    g = sess.run(gen, feed_dict={model.training_phase: True, model.z: z})
    print("generator time: {:.3f} s".format(time.time() - start))

@chenxianghu
Author

chenxianghu commented May 29, 2018

I tested your pre-trained model; the timings:

1. original image -> quantized representation: about 1.5 s
2. quantized representation -> reconstructed image: about 1 s

These results differ from my earlier numbers because my own input images are 256x256.

I also tested the model's quality on different images:

  1. images from leftImg8bit/train: good
  2. images from leftImg8bit/test: worse than images from the train dir
  3. images from the internet: terrible

Can this model be used to compress arbitrary images?

@Justin-Tan
Owner

Justin-Tan commented May 29, 2018

If you want to compress arbitrary images, train on a large dataset of natural images like ImageNet or the ADE20k dataset. The pretrained model was only trained on the Cityscapes dataset, which is a collection of street scenes from Germany and Switzerland.

The distribution of images in ImageNet/ADE20k will be more diverse, so the model will probably take longer to converge. To train on ADE20k, download the dataset from the link in the readme and pass the -ds ADE20k flag:

python3 train.py -ds ADE20k <args>

To train on ImageNet you will have to write your own data loader. I think it will work with the default setup, but you will have to check this.
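
A rough sketch of what such a loader could look like with tf.data (hypothetical, not code from this repo; the width-512 rescaling mirrors the ADE20k preprocessing, and the paths are illustrative):

import tensorflow as tf

def load_image(path):
    # Decode a JPEG and rescale so the width is 512px, keeping aspect ratio.
    image = tf.image.decode_jpeg(tf.read_file(path), channels=3)
    image = tf.image.convert_image_dtype(image, tf.float32)
    shape = tf.shape(image)
    new_h = tf.cast(shape[0] * 512 / shape[1], tf.int32)
    return tf.image.resize_images(image, [new_h, 512])

paths = tf.data.Dataset.list_files('/path/to/imagenet/*.JPEG')
dataset = paths.map(load_image).batch(1)   # batch 1: images have differing heights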

@chenxianghu
Author

First I trained my model on Cityscapes for 60 epochs, then continued training it on ADE20k for 10 epochs, and I find the compression quality became worse. Maybe the model hasn't converged. I think it is hard to compress arbitrary images with one model.

@Justin-Tan
Owner

Don't train on Cityscapes initially; just train on ADE20k. Make sure you pull the latest version, I fixed a couple of errors in the code.

It will take a long time for the model to converge on ADE20k; the authors originally trained for 50 epochs to get the results in the paper.

@chenxianghu
Author

OK, this morning I also read the paper and found I should train on ADE20k from scratch, but an error occurred:
it seems the shapes of self.w_hat and Gv didn't match, so I disabled noise sampling by adding a condition as below. Now it is working and training. Thank you!

        # Skip noise sampling for ADE20k: the spatial shape of Gv from
        # dcgan_generator does not match self.w_hat for these image sizes.
        if config.sample_noise is True and dataset != 'ADE20k':
            print('Sampling noise...')
            # noise_prior = tf.contrib.distributions.Uniform(-1., 1.)
            # self.noise_sample = noise_prior.sample([tf.shape(self.example)[0], config.noise_dim])
            noise_prior = tf.contrib.distributions.MultivariateNormalDiag(loc=tf.zeros([config.noise_dim]), scale_diag=tf.ones([config.noise_dim]))
            v = noise_prior.sample(tf.shape(self.example)[0])
            Gv = Network.dcgan_generator(v, config, self.training_phase, C=config.channel_bottleneck, upsample_dim=config.upsample_dim)
            print('Gv:', Gv)
            self.z = tf.concat([self.w_hat, Gv], axis=-1)
        else:
            self.z = self.w_hat

@wensihan

wensihan commented Jun 12, 2018

I modified the network as you did, but there is still a problem: Incompatible shapes: [1,3,688,512] vs. [1,3,683,512] at line 127 of model.py: distortion_penalty = config.lambda_X * tf.losses.mean_squared_error(self.example, self.reconstruction). Do you have any suggestions?

@wensihan

@chenxianghu

@chenxianghu
Author

The shapes of self.example and self.reconstruction should be the same; for the Cityscapes dataset that is
[1, 512, 1024, 3], i.e. [batch_size, height, width, channels].
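
The 688 vs. 683 mismatch suggests one input dimension is not a multiple of the network's total downsampling factor (16 here): 683 is not divisible by 16, so the decoder upsamples back to 688 rather than 683. One common workaround (a sketch; I don't know whether the repo's input pipeline already handles this) is to crop each dimension down to the nearest multiple of 16 before feeding the network:

import tensorflow as tf

def crop_to_multiple(image, factor=16):
    # Crop H and W down so that encoder downsampling and decoder
    # upsampling round-trip to exactly the input shape.
    shape = tf.shape(image)
    h = (shape[0] // factor) * factor
    w = (shape[1] // factor) * factor
    return image[:h, :w, :]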

@wensihan

wensihan commented Jun 12, 2018

I'm using the ADE20k dataset, only rescaling the width to 512px. Are there any changes to other parameters besides disabling the sample noise? @chenxianghu

@chenxianghu
Author

I modified many places:
1) made my own h5 file, using only the 200x200 to 975x975 JPEG images in ADE20k (as in the paper)
2) resized images to [512, 512], with no padding or cropping
3) used tf.image.decode_jpeg, not tf.image.decode_png
4) modified Network.dcgan_generator to adapt to [512, 512]

I think it is better to learn some basics first and then try to train your own model!
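
For example, steps 2 and 3 together look roughly like this (a sketch, not my exact code):

import tensorflow as tf

def preprocess(path):
    image = tf.image.decode_jpeg(tf.read_file(path), channels=3)   # step 3: JPEG, not PNG
    image = tf.image.convert_image_dtype(image, tf.float32)
    return tf.image.resize_images(image, [512, 512])               # step 2: plain resize, no pad/crop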

@wensihan

wensihan commented Jun 12, 2018

@chenxianghu First, thank you very much for your reply. I still have a question: does 200x200 to 975x975 mean that images larger or smaller than this range are excluded? And so the dataset contains fewer than 20210 training images, right?

@chenxianghu
Author

Yes, this is the description from the original paper:

Data sets: We train the proposed method on two popular data sets that come with hand-annotated semantic label maps, namely Cityscapes [42] and ADE20k [43]. Both of these data sets were previously used with GANs [12, 33], hence we know that GANs can model their distribution, at least to a certain extent. Cityscapes contains 2975 training and 500 validation images of dimension 2048×1024px, which we resampled to 1024×512px for our experiments. The training and validation images are annotated with 34 and 19 classes, respectively. From the ADE20k data set we use the SceneParse150 subset with 20 210 training and 2000 validation images of a wide variety of sizes (200×200px to 975×975px), each annotated with 150 classes. During training, the ADE20k images are rescaled such that the width is 512px.

@wensihan

wensihan commented Jun 12, 2018

I know this. I'm just puzzled: does the sentence "20 210 training and 2000 validation images of a wide variety of sizes (200×200px to 975×975px)" mean that the 20210 training images' sizes all vary within 200x200 to 975x975?

@chenxianghu
Author

I checked; some JPEG images' sizes are not in the range 200x200 to 975x975.
For example, ADE20K\images\training\h\hacienda\ADE_train_00008829.jpg is 1024x768.
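
A quick way to find them (a sketch; adjust the glob pattern to your local layout):

import glob
from PIL import Image

for path in glob.glob('ADE20K/images/training/**/*.jpg', recursive=True):
    w, h = Image.open(path).size
    if not (200 <= w <= 975 and 200 <= h <= 975):
        print(path, w, h)   # e.g. ADE_train_00008829.jpg -> 1024 768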

@wensihan

Yes, that's why I was puzzled... Okay, I see now: the training dataset is smaller than 20210 images. Thank you~

@Jillian2017

@chenxianghu Hi, do you add noise while training on ADE20k? I ran into an error caused by a mismatch between the noise's dimension and the encoder network's output, so I wonder if we have to change the method of generating the noise. Also, is your result acceptable on the ADE20k dataset? Mine is quite poor and the generator is not convergent after almost 40 epochs.

@chenxianghu
Author

@Jillian2017 I do add noise while training on the ADE20k dataset, by modifying the Network.dcgan_generator function to adapt to 512x512. My generated image quality is also poor after 40 epochs; some generated images even have strange colored patches that don't exist in the original images. Do you see this too? I don't know why.

@zhiqiang-zhu

@chenxianghu Hi, could you leave an email address? I would like to ask you some questions.
