Relationship between channels C and bpp #5
If you read the original paper (https://arxiv.org/pdf/1804.02958.pdf), the upper bound on the bitrate is given by Eq. 5. Here dim(ŵ) is the dimensionality of the quantized representation ŵ and L is the number of quantization centers, so the entropy is upper-bounded by dim(ŵ) · log2(L) bits.
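Concretely, for a Cityscapes-sized 512x1024 input with C = 8 channels and L = 5 centers (a sketch assuming the paper's factor-16 spatial downsampling):

```python
import math

H, W, C, L = 512, 1024, 8, 5            # image size, channels, quantization centers
dim_w_hat = (H // 16) * (W // 16) * C   # dimensionality of the quantized feature map
bound_bits = dim_w_hat * math.log2(L)   # Eq. 5 upper bound on H(w_hat), in bits
```

Dividing bound_bits by the number of pixels H * W gives the bpp figures quoted in the results.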
I want to test the performance of this model, so I modified the single_plot function like this and then ran your compress.py; there are two steps:
If I want to implement end-to-end image compression, I think I should save the quantized representation to a file on the sender side, and recover the reconstructed image from that file on the receiver side; both sender and receiver should load the well-trained model. Is my thinking right?
I tested your pre-trained model on the test data. My test result is different from yours because my input image size is 256x256. I also tested the model's effect on different images:
Can this model be used for compressing arbitrary images?
If you want to compress arbitrary images, train on a large dataset of natural images like ImageNet or ADE20k. The pretrained model was only trained on the Cityscapes dataset, which is a collection of street scenes from Germany and Switzerland. The distribution of images in ImageNet/ADE20k is more diverse, so the model will probably take longer to converge. To train on ADE20k, download the dataset from the link in the readme and pass the dataset flag: python3 train.py -ds ADE20k <args>. To train on ImageNet you will have to write your own data loader. I think it will work with the default setup, but you will have to check this.
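A custom loader mainly needs to produce batches of image paths for the existing pipeline to decode; a minimal sketch, assuming ImageNet extracted as a directory tree of .JPEG files (both helper names here are hypothetical, not part of the repo):

```python
import glob
import os
import random

def list_imagenet_paths(root, shuffle=True, seed=42):
    """Collect all .JPEG paths under `root`, optionally shuffled deterministically."""
    paths = sorted(glob.glob(os.path.join(root, '**', '*.JPEG'), recursive=True))
    if shuffle:
        random.Random(seed).shuffle(paths)
    return paths

def batches(paths, batch_size):
    """Yield fixed-size batches of paths, dropping the ragged remainder.
    The model's input pipeline would map a JPEG-decoding op over each batch."""
    usable = len(paths) - len(paths) % batch_size
    for i in range(0, usable, batch_size):
        yield paths[i:i + batch_size]
```

The decoding and resizing itself would reuse whatever the repo's existing dataset pipeline does for ADE20k.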
First I trained my model on Cityscapes for 60 epochs, and then continued training it on ADE20k for 10 epochs. I find the compression effect becomes worse; maybe the model hasn't converged. I think it is hard to compress arbitrary images with one model.
Don't train on Cityscapes initially; just train on ADE20k. Make sure you pull the latest version; I fixed a couple of errors in the code. It should take a long time for the model to converge on ADE20k; the authors originally trained for 50 epochs to get the results in the paper.
OK, this morning I also read the paper and found I should train on ADE20k from scratch, but one error occurred:
I modified the network as you did, but there is still a problem: Incompatible shapes: [1,3,688,512] vs. [1,3,683,512] at line 127 (model.py): distortion_penalty = config.lambda_X * tf.losses.mean_squared_error(self.example, self.reconstruction). Do you have any suggestions?
The shapes of self.example and self.reconstruction should be the same; for the Cityscapes dataset it should be
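The 688 vs. 683 mismatch is consistent with a divisibility issue: the encoder downsamples by a factor of 16 and the decoder upsamples back, so an input height of 683 comes out as 688 (the next multiple of 16) after reconstruction. One workaround is to pad each image to a multiple of 16 before feeding it in; a sketch (the helper name is hypothetical):

```python
def pad_shape_to_multiple(height, width, multiple=16):
    """Return the smallest (H, W) >= (height, width) with both divisible by
    `multiple`. Pad the image to this shape (e.g. by reflection) before
    encoding, then crop the reconstruction back to the original size."""
    pad_h = (multiple - height % multiple) % multiple
    pad_w = (multiple - width % multiple) % multiple
    return height + pad_h, width + pad_w
```

Here pad_shape_to_multiple(683, 512) returns (688, 512), which matches the reconstruction shape in the error above.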
I use the ADE20K dataset with only the width rescaled to 512px; are there any changes to other parameters besides disabling the sample noise? @chenxianghu
I modified many places. I think it is better that you learn some basic knowledge first and then try to train your own model!
@chenxianghu First, thank you very much for your reply. I still have a question: does 200x200 to 975x975 mean that images larger or smaller than this range are excluded? And then the dataset contains fewer than 20210 training images, right?
Yes, this is the description from the original paper: "Data sets: We train the proposed method on two popular data sets that come
I know this; I'm just puzzled whether this sentence ("20 210 training and 2000 validation images of a wide variety of sizes (200×200px to 975×975px)") means that the 20210 training images' sizes vary from 200×200 to 975×975?
I checked; some JPEG images' sizes are not in the range from 200×200 to 975×975.
Yes, so I am puzzled... OK, I understand now: the training dataset is smaller than 20210 images. Thank you~
@chenxianghu Hi, do you add noise while training on ADE20K? I came across an error resulting from a mismatch between the noise's dimensions and the encoder network's output, so I wonder if we have to change the method of generating noise. Also, is your result on the ADE20K dataset acceptable? Mine is quite poor, and the generator has not converged after almost 40 epochs.
@Jillian2017 I add noise while training on the ADE20K dataset by modifying the Network.dcgan_generator function to adapt it to 512x512. My generated images' quality is also poor after 40 epochs; some generated images even have strange colored patches that don't exist in the original images. Do you have this problem? I don't know why.
@chenxianghu Hi, can you leave an email? I would like to ask you some questions.
You said C=8 channels (compression to 0.072 bpp) in the Results section.
I don't know the relationship between C and bpp; can you explain it to me?
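The relationship can be read off the paper's Eq. 5: the encoder maps a W×H image to a W/16 × H/16 × C feature map, each element quantized to one of L centers, so bpp is bounded by C · log2(L) / 16², independent of the image size. A quick check, assuming the paper's L = 5 centers:

```python
import math

# bpp bound = dim(w_hat) * log2(L) / (W * H), with
# dim(w_hat) = (W/16) * (H/16) * C, so the image size cancels out.
C, L = 8, 5                        # channels, quantization centers
bpp = C * math.log2(L) / (16 * 16)
print(round(bpp, 4))               # ~0.0726, matching the reported 0.072 bpp for C=8
```

Doubling C doubles the bpp bound, which is why the Results section sweeps over the number of channels.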
We compare this image compression effect with BPG:
png image -> decode -> our encode -> our quantize. Result: quantized representation.
The bpp comparison is between the quantized representation and the BPG-encoded file:
at the same bpp, which decoded image's quality is better, or at the same decoded image quality, which bpp is lower, right?
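A sketch of how the two rates can be put on the same scale, assuming the paper's factor-16 downsampling and L quantization centers (both function names are hypothetical):

```python
import math

def bpp_quantized(height, width, channels=8, centers=5, factor=16):
    """Upper-bound bpp of the quantized representation:
    dim(w_hat) * log2(L) / (H * W)."""
    dim = (height // factor) * (width // factor) * channels
    return dim * math.log2(centers) / (height * width)

def bpp_file(num_bytes, height, width):
    """bpp of any encoded file (e.g. bpgenc output) of `num_bytes` bytes."""
    return num_bytes * 8 / (height * width)
```

With these, you can tune BPG's quality parameter until bpp_file matches bpp_quantized and then compare the decoded images, or fix the quality target and compare the rates.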