Issue 32771: html file not saved; _files folder created
21 people starred this issue and may be notified of changes.
Reported by RaySte...@gmail.com, Jan 20, 2010
Chrome Version       : 4.0.249.30 (Official Build 33928)
URLs (if applicable): http://www.startribune.com/local/east/82213892.html?elr=KArksLckD8EQDUUUnciaec8O7EyUsl
Other browsers tested:
Add OK or FAIL after other browsers where you have tested this issue:
Safari 4:
Firefox 3.x: 3.0.15
IE 7:
IE 8:

What steps will reproduce the problem?
1. Go to the URL
2. Right-click, Save page
3. (Optional?) Type a page number after the suggested save name (the title shown on the web page)
4. Save it

What is the expected result? saved html file and ..._files folder.  


What happens instead? html page is NOT saved but ..._files folder is
created.  Folder contains: 91 files with Chrome browser; 3 subfolders and
75 files with Firefox. 


Please provide any additional information below. Attach a screenshot if
possible.

Folder name for the _files folder contained some spaces and a %20 after the
added page number.  Running under Ubuntu with 8 open Chrome windows; about
43 tabs (a count in task manager would help).

Jan 22, 2010
#1 phajdan.jr@chromium.org
(No comment was entered for this change.)
Labels: -Area-Undefined Area-Feature Feature-SavePage
Feb 16, 2010
#2 lafo...@chromium.org
(No comment was entered for this change.)
Labels: -Area-Feature Area-UI
Apr 12, 2011
#3 rdsmith@chromium.org
(No comment was entered for this change.)
Labels: Feature-Downloads
Feb 9, 2012
#4 cbentzel@chromium.org
That page doesn't save on 19.0.1035.0 Win - even without replacing the suggested filename. The subresources do save as mentioned.
Status: Available
Jun 21, 2012
#5 rdsmith@chromium.org
(No comment was entered for this change.)
Blocking: -68358 chromium:68358 chromium:68358
Jul 2, 2012
#6 rsesek@chromium.org
Issue 129973 has been merged into this issue.
Cc: benjhayden@chromium.org rdsmith@chromium.org
Jul 2, 2012
#7 rsesek@chromium.org
Issue 86282 has been merged into this issue.
Jul 2, 2012
#8 rsesek@chromium.org
(No comment was entered for this change.)
Labels: OS-All
Jul 3, 2012
#9 muir_b...@inbox.com
[ "Save Page As" dialog results in folder w/o page ("Removed" status in footer) ]

The above is the subject line from the formerly active issue #129973 (merged into this one per "rsesek" in comment #6). This comment is intended as an aid to body searches and to acknowledge that issue #32771 is now the rally point.

URLs from issue #129973:
1) http://www.sfgate.com/cgi-bin/article.cgi?f=/c/a/2012/05/25/BAJT1OMSB4.DTL (p+g)
http://www.sfgate.com/bayarea/place/article/Golden-Gate-Bridge-s-plaza-flawed-but-workable-3585446.php
2) http://www.sfgate.com/cgi-bin/article.cgi?f=/c/a/2012/05/13/MN741OF5BC.DTL (g)
http://www.sfgate.com/bayarea/place/article/Golden-Gate-Bridge-construction-and-indignation-3554707.php
3) http://www.slate.com/articles/life/walking/2012/04/walking_in_america_how_we_can_become_pedestrians_once_more_.html
4) http://www.sfgate.com/cgi-bin/article.cgi?f=/c/a/2012/05/19/BAPT1OBTTQ.DTL (g)
http://www.sfgate.com/bayarea/article/Golden-Gate-Bridge-monument-work-of-art-star-3571131.php
Note - The SFGate.com URLs have galleries that also glitch when saving (p=page; g=gallery). The site went through a reorganization recently, so I'm also listing the current URLs to avoid confusion.
Sep 4, 2012
#10 rdsmith@chromium.org
Issue 145965 has been merged into this issue.
Sep 5, 2012
#11 kel...@gmail.com
This bug is almost 3 years old (maybe 4 years, since Chrome 1.0, but nobody reported it).

It would be a nice surprise for Chrome's 4th anniversary if this bug were fixed in the upcoming Chrome 22.0 Stable.

See my reported issues here, and view some net-internals logs : https://code.google.com/p/chromium/issues/detail?id=145965.

If the devs need any help to track the cause of this bug, please let me know.
Sep 5, 2012
#12 rdsmith@chromium.org
The problem is that the code this bug is in is complex and messy and we're planning to get rid of it in the next few months (incorporate it in the main download code).  So I do expect that this bug will be fixed by the end of the year as part of ongoing refactoring efforts, just by a junk-and-rewrite rather than digging into the current code. 

Sep 5, 2012
#13 kel...@gmail.com
That's great, but until then, I suggest reviewing the current code to find what's stopping the *.html file from being created (e.g. maybe a script that is not properly loaded). Maybe the file saving can be forced regardless of the errors found in the loaded page.
Oct 29, 2012
#14 frantise...@gmail.com
I support this issue: I experience the same. Only the "_files" dir is created (and filled), but the .htm file is not created, in roughly 1/3 of my cases! I still have not deduced the reason. I suspected that it could be related to a "too long file path" for the resulting files in the save; however, the problem appears even when saving into the root dir of my disk (even to C:\).

Note that the issue is related to the "full save": in the case of "HTML file only", the .htm file IS saved.
...but is that one the same as the .htm file of the "full page"? Hardly.
Oct 29, 2012
#15 frantise...@gmail.com
This issue is caused by the Chrome browser (Version 22.0.1229.94 m = recent), and not by any web app. So other browsers are irrelevant.
Oct 29, 2012
#16 kel...@gmail.com
This bug has been happening since Chrome 4.0 and it's still not fixed in the latest Chrome 24.0.1305.3 Dev.

If it helps you, I've found a solution. Install the AdBlock extension, and all your pages will be saved. 

I guess that Chrome fails to save pages with a lot of ads. Other browsers work on every page.
Oct 29, 2012
#17 asanka@chromium.org
When I try to save http://www.sfgate.com/cgi-bin/article.cgi?f=/c/a/2012/05/25/BAJT1OMSB4.DTL on a debug build, I hit the DCHECK(!detached_) in BaseFile.

Looks like we schedule a SaveFileManager::UpdateSaveProgress() after a SaveFileManager::SaveFinished() for the same save_item.

Oct 29, 2012
#18 rdsmith@chromium.org
Huh. Probably at least somewhat related to issue 144751 (same DCHECK, I believe different pathway). Send me the backtrace and I'll try to fix that one with 144751.

Do you think it's related to the root cause of this bug? In my issue, that DCHECK only happens after we've completed saving the file.


Nov 1, 2012
#19 asanka@chromium.org
The stack for the PostTask for SaveFileManager::SaveFinished() call is:

1d5f55d8  10077fb1 base!base::debug::StackTrace::StackTrace+0x21 [d:\src\chrome\src\base\debug\stack_trace_win.cc @ 149]
1d5f55dc  08a47779 content!content::SavePackage::OnReceivedSerializedHtmlData+0xae9 [d:\src\chrome\src\content\browser\download\save_package.cc @ 1086]**
1d5f55e0  086f0454 content!DispatchToMethod<content::SavePackage,void (__thiscall content::SavePackage::*)(GURL const &,std::basic_string<char,std::char_traits<char>,std::allocator<char> > const &,int),GURL,std::basic_string<char,std::char_traits<char>,std::allocator<char> >,int>+0x24 [d:\src\chrome\src\base\tuple.h @ 559]
1d5f55e4  089a3702 content!ViewHostMsg_SendSerializedHtmlData::Dispatch<content::SavePackage,content::SavePackage,void (__thiscall content::SavePackage::*)(GURL const &,std::basic_string<char,std::char_traits<char>,std::allocator<char> > const &,int)>+0x82 [d:\src\chrome\src\content\common\view_messages.h @ 2288]
1d5f55e8  0898b8ba content!content::SavePackage::OnMessageReceived+0x19a [d:\src\chrome\src\content\browser\download\save_package.cc @ 953]
1d5f55ec  08e75aab content!content::WebContentsImpl::OnMessageReceived+0xeb [d:\src\chrome\src\content\browser\web_contents\web_contents_impl.cc @ 675]
1d5f55f0  08cef193 content!content::RenderViewHostImpl::OnMessageReceived+0x1f3 [d:\src\chrome\src\content\browser\renderer_host\render_view_host_impl.cc @ 947]
1d5f55f4  08cc7d5d content!content::RenderProcessHostImpl::OnMessageReceived+0x85d [d:\src\chrome\src\content\browser\renderer_host\render_process_host_impl.cc @ 1075]
[...]

** The location in content::SavePackage::OnReceivedSerializedHtmlData() is:

  // Current frame is completed saving, call finish in file thread.
  if (flag == WebPageSerializerClient::CurrentFrameIsFinished) {
    VLOG(20) << " " << __FUNCTION__ << "()"
             << " save_id = " << save_item->save_id()
             << " url = \"" << save_item->url().spec() << "\"";
    BrowserThread::PostTask(
        BrowserThread::FILE, FROM_HERE,
        base::Bind(&SaveFileManager::SaveFinished,
                   file_manager_,
                   save_item->save_id(),
                   save_item->url(),
                   id,
                   true)); <----
  }

And then, for the same save_item, we do a PostTask for SaveFileManager::UpdateSaveProgress() with the following stack:

1d7269d8  10077fb1 base!base::debug::StackTrace::StackTrace+0x21 [d:\src\chrome\src\base\debug\stack_trace_win.cc @ 149]
1d7269dc  08a47473 content!content::SavePackage::OnReceivedSerializedHtmlData+0x7e3 [d:\src\chrome\src\content\browser\download\save_package.cc @ 1070]
1d7269e0  086f0454 content!DispatchToMethod<content::SavePackage,void (__thiscall content::SavePackage::*)(GURL const &,std::basic_string<char,std::char_traits<char>,std::allocator<char> > const &,int),GURL,std::basic_string<char,std::char_traits<char>,std::allocator<char> >,int>+0x24 [d:\src\chrome\src\base\tuple.h @ 559]
1d7269e4  089a3702 content!ViewHostMsg_SendSerializedHtmlData::Dispatch<content::SavePackage,content::SavePackage,void (__thiscall content::SavePackage::*)(GURL const &,std::basic_string<char,std::char_traits<char>,std::allocator<char> > const &,int)>+0x82 [d:\src\chrome\src\content\common\view_messages.h @ 2288]
1d7269e8  0898b8ba content!content::SavePackage::OnMessageReceived+0x19a [d:\src\chrome\src\content\browser\download\save_package.cc @ 953]
1d7269ec  08e75aab content!content::WebContentsImpl::OnMessageReceived+0xeb [d:\src\chrome\src\content\browser\web_contents\web_contents_impl.cc @ 675]
1d7269f0  08cef193 content!content::RenderViewHostImpl::OnMessageReceived+0x1f3 [d:\src\chrome\src\content\browser\renderer_host\render_view_host_impl.cc @ 947]
1d7269f4  08cc7d5d content!content::RenderProcessHostImpl::OnMessageReceived+0x85d [d:\src\chrome\src\content\browser\renderer_host\render_process_host_impl.cc @ 1075]
1d7269f8  0219d407 ipc!IPC::ChannelProxy::Context::OnDispatchMessage+0x157 [d:\src\chrome\src\ipc\ipc_channel_proxy.cc @ 261]
[...]

** The location in content::SavePackage::OnReceivedSerializedHtmlData() is:

  if (!data.empty()) {
    // Prepare buffer for saving HTML data.
    scoped_refptr<net::IOBuffer> new_data(new net::IOBuffer(data.size()));
    memcpy(new_data->data(), data.data(), data.size());

    // Call write file functionality in file thread.
    BrowserThread::PostTask(
        BrowserThread::FILE, FROM_HERE,
        base::Bind(&SaveFileManager::UpdateSaveProgress,
                   file_manager_,
                   save_item->save_id(),
                   new_data,
                   static_cast<int>(data.size()))); <----
  }

(Ignore the small line number offsets. I added some instrumentation to capture the stack traces.)
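
To make the ordering hazard above concrete, here is a minimal sketch, assuming a single FIFO task queue per file. Everything in it (FakeSaveFile, RunFinishThenWrite, ExampleLateWrite) is a hypothetical stand-in, not Chromium's BaseFile or SaveFileManager; it only shows that a write queued after a finish for the same item has nowhere to go.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>

// Hypothetical stand-in for a file being saved; not Chromium's BaseFile.
class FakeSaveFile {
 public:
  // Appends data; refuses once the file has been finished (detached).
  bool Write(const std::string& data) {
    if (detached_) return false;
    contents_ += data;
    return true;
  }
  void Finish() { detached_ = true; }
  const std::string& contents() const { return contents_; }

 private:
  bool detached_ = false;
  std::string contents_;
};

// Queues "finish" and then "write" for the same file, runs them in FIFO
// order as a single task thread would, and returns the rejected count.
int RunFinishThenWrite(FakeSaveFile& file) {
  std::queue<std::function<bool()>> tasks;
  tasks.push([&file] { file.Finish(); return true; });
  tasks.push([&file] { return file.Write("<html>late chunk</html>"); });
  int rejected = 0;
  while (!tasks.empty()) {
    if (!tasks.front()()) ++rejected;
    tasks.pop();
  }
  return rejected;
}

// Usage: the late write is rejected, so nothing reaches the file.
void ExampleLateWrite() {
  FakeSaveFile file;
  assert(RunFinishThenWrite(file) == 1);
  assert(file.contents().empty());
}
```

In this toy model the late write is simply rejected; in the debug build described above, the equivalent condition is what trips DCHECK(!detached_).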

Nov 1, 2012
#20 rdsmith@chromium.org
Ah!!  Thanks.  Yes, that doesn't seem to be the same problem as the one I'm chasing.  I *do* think it's the same problem as issue 106364 (which I should update with some more information in case anyone wants to pick it up--we figured it out a while ago and then dropped it) where multiple iframes in a page had the same URL, and hence were being mapped to the same SaveItem (which is pretty much the only way that I can think of to get the backtraces you posted).  I'll update 106364.
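
A rough illustration of the collision described above, assuming items are looked up by URL alone. The Frame struct, IndexFramesByUrl, and the example.test URLs are made up for this sketch and are not taken from the SavePackage code.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical frame record; the fields are illustrative only.
struct Frame {
  std::string url;
  std::string html;
};

// Builds a URL-keyed index of frames, the way a URL -> item lookup would.
// Each entry lists the positions of all frames that share that URL.
std::map<std::string, std::vector<int>> IndexFramesByUrl(
    const std::vector<Frame>& frames) {
  std::map<std::string, std::vector<int>> index;
  for (int i = 0; i < static_cast<int>(frames.size()); ++i) {
    index[frames[i].url].push_back(i);
  }
  return index;
}

// Usage: two iframes with the same URL collapse into one entry, so a
// "finished" signal from one and data from the other hit the same item.
void ExampleSharedUrl() {
  std::vector<Frame> frames = {
      {"http://example.test/ad", "<p>first iframe</p>"},
      {"http://example.test/ad", "<p>second iframe</p>"},
  };
  auto index = IndexFramesByUrl(frames);
  assert(index.size() == 1);
  assert(index["http://example.test/ad"].size() == 2);
}
```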

Mar 10, 2013
#21 bugdroid1@chromium.org
(No comment was entered for this change.)
Labels: -Feature-SavePage -Area-UI -Feature-Downloads Cr-Content-SavePage Cr-UI Cr-UI-Browser-Downloads
Apr 5, 2013
#22 bugdroid1@chromium.org
(No comment was entered for this change.)
Labels: Cr-Blink
Apr 5, 2013
#23 bugdroid1@chromium.org
(No comment was entered for this change.)
Labels: -Cr-Content-SavePage Cr-Blink-SavePage
Apr 8, 2013
#24 agarc...@gmail.com
strange?????
Follow instructions @
http://www.ghacks.net/2011/05/26/single-file-save-websites-to-a-single-html-file-in-chrome/

I quote:
"Start with the installation of Single File and complete the process with the installation of Single File Core afterwards"

I still haven't used the "save as single file" feature, but after installing these Chrome add-ons, I can now do "save as Web Page, Complete"
Apr 30, 2013
#25 rdsmith@chromium.org
(No comment was entered for this change.)
Cc: -rdsmith@chromium.org
Jun 24, 2013
#28 jackdac...@gmail.com
in reply to #24:

unfortunately that's not a solution for everyone 

in certain cases the embedded pictures of a website *HAVE* to be saved 

time & productivity constraints don't allow using dubious extensions where for every site you have to click on 2 buttons, wait in between 

and/or save images individually 


if any dev is tracking this: 

a website where it reliably fails for one or for every subsequent page: 

http://medikamente.onmeda.de/Wirkstoffe/Cabergolin.html 



starting with "Überblick" 

then subsequently click on "Gegenanzeigen" -> save -> "Nebenwirkungen" -> save -> ... 

Überblick 
Gegenanzeigen 
Nebenwirkungen 
Wechselwirkungen 
Warnhinweise 
Medikamente 
Wirkung 


at the end I only had the folders of the supposedly saved pages with the files but NOT the html file


ymmv: the first time I had 1 html file, the following time (starting with clean folder) - NONE



in reply to #16:

using the adblock extension is a great temporary "fix" - with it the above website can be saved - thanks !


(sorry for the frequent re-posting; will stop & observe the topic now)


Thanks 

*fingers crossed* that this issue can be tracked down & fixed for good
Jun 25, 2013
#29 asanka@chromium.org
Issue 253921 has been merged into this issue.
Sep 26, 2013
#30 niche...@gmail.com
This issue is still UNRESOLVED in Version 29.0.1547.66 m
It is very dodgy since a page can appear to have been saved correctly, but when you click on the 'Down Arrow' and select 'Show in Folder' the Folder does not appear.
Usually if one closes the tab, before the page is completely saved, it will abort the save and report 'Cancelled'. The Folder will still be saved.
However the 'Removed' report does not show up until one tries another 'Save'.
Altogether this makes Chrome 'VERY DODGY' as one cannot rely upon it to save Vital Pages without constantly checking to see if the page has been saved.
This is a CRUCIAL ISSUE that requires IMMEDIATE ATTENTION!
The 'AdBlocker' Extension Workaround is not a Worthy Solution to this Problem!

Jan 17, 2014
#31 azondisc...@gmail.com
This issue is still unresolved and none of the available adblockers seem to have any impact on the matter, which occurs even on pages that do not have any ads anyway!
The only recourse is to use IE which does not have a problem saving html pages!
If you reload the IE-saved page into Chrome it re-saves without a problem!
I've tried disabling every Extension but it has no effect, indicating that someone needs to rewrite the 'Save as' Code from the ground up!
Occasionally if you refresh the page to be saved it 'seems' to help matters, but some pages are so stubborn that I have tried countless times and still they cannot be saved!
Mar 24, 2014
#32 pda...@opera.com
I plan to investigate this bug with the aim of fixing it. Is any work currently in progress on the code involved? If not, I can start investigating, I suppose...
Mar 24, 2014
#34 frantise...@gmail.com
* "Occasionally if you refresh the page to be saved it 'seems' to help matters, but some pages are so stubborn that I have tried countless times and still they cannot be saved!" - yes, me too. The same. :(

Another idea I had is to scroll all the way down to the bottom of the page in the UI. I am not sure myself whether this helps at all... Just one of my attempts.

* "I plan to investigate this bug with a purpose of fixing it." - Oh yes please please please!
Mar 27, 2014
#35 pda...@opera.com
Could someone from the Chromium team confirm or deny that there are no significant tasks in progress in the code involved, please? I would like to avoid a situation in which I work on code that is going away. Thanks.
Mar 28, 2014
#36 asanka@chromium.org
#35: No significant changes are planned for the SavePackage et al. code in the short term.
Apr 1, 2014
#37 pda...@opera.com
In the following simple test case (TC):

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
<script type="text/javascript">

window.onload = function() {
   var elm = document.getElementById('iframe_elm');
   var iframe_doc = elm.contentDocument;
   iframe_doc.open();
   iframe_doc.write("<a href=\"some_url\">Link</a>");
   iframe_doc.close();
};

</script>
</head>
<body>
<iframe id="iframe_elm"></iframe>
</body>
</html>

the iframe ends up with the same URL (Document::m_url) as the top document. For me it was a 'file://' URL with the path in my filesystem. That URL is set during document.open() execution. Then, when trying to save the page, the DCHECKs mentioned above fail, that is, the one inside SaveFileManager::UpdateSaveProgress. However, the main file for this page is saved correctly. Still, it seems only a matter of time to make a real minimal TC for this bug out of it.

That makes me wonder: is there any particular reason why the iframe ends up with the parent document's URL in this case? If not, and we substitute that URL with, e.g., some constant indicating that the content came from document.write(), then we will have a workaround for the most severe cases of this bug...

I'm working with a real TC of http://otomoto.pl/audi-q7-3-0-tdi-wersja-7-osobowa-C32393658.html

and this seems to be the reason. From the comments, it seems this is the reason for most, if not all, cases of a missing main webpage file after saving.
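
The substitution idea could look roughly like the sketch below. It is only an illustration under stated assumptions: AssignSaveKeys and the "synthetic-frame:" prefix are hypothetical names, and this is not the patch or anything in Chromium. A frame keeps its URL as its key unless that URL already belongs to the top document or to an earlier frame, in which case it gets a synthetic key.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper, not part of Chromium: returns one save key per frame.
// A frame keeps its URL as the key unless that URL is already taken by the
// top-level document or an earlier frame; then it gets a synthetic key.
// Assumes no real frame URL starts with "synthetic-frame:".
std::vector<std::string> AssignSaveKeys(
    const std::string& top_url, const std::vector<std::string>& frame_urls) {
  std::set<std::string> used = {top_url};
  std::vector<std::string> keys;
  for (std::size_t i = 0; i < frame_urls.size(); ++i) {
    std::string key = frame_urls[i];
    if (!used.insert(key).second) {
      // URL already in use: substitute a key derived from the frame index.
      key = "synthetic-frame:" + std::to_string(i);
      used.insert(key);
    }
    keys.push_back(key);
  }
  return keys;
}

// Usage: an iframe written via document.write() inherits the parent's URL
// and is therefore given its own key instead of sharing the parent's.
void ExampleSaveKeys() {
  std::vector<std::string> keys =
      AssignSaveKeys("file:///tmp/page.html", {"file:///tmp/page.html"});
  assert(keys.size() == 1);
  assert(keys[0] == "synthetic-frame:0");
}
```

A single constant, as floated above, would separate the iframe from its parent but not two such iframes from each other, which is why this sketch uses the frame index.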
Apr 7, 2014
#38 pda...@opera.com
I have fixed the bug by working around the flawed save-page architecture. The goal was to fix most cases of the bug without regressing anything. The change will be submitted after our tester(s) confirm that they couldn't find any real regression.
May 29, 2014
#40 abobri...@gmail.com
I've read a bit of the code review above. Please do NOT drop this patch. This bug has been going on for ages... I know it's a typical complaint, and sorry for that. I wish I could contribute further, but I'm just one of the MANY people on the web having such a basic problem with such an advanced browser as Chrome.

Just search for people with the same issues - there are thousands, and the bugs are ancient.

Please - don't drop the patch. Worse changes have been added to Chrome before. It kept crashing for 3 months in a row after a commit killed its compatibility with a video card I owned.

Please? With a shiny pixel on top?
Apr 14, 2015
#41 ste...@gmail.com
This bug still hasn't been solved... please FIX!
It's ridiculous that I can't save a complete webpage and am not even informed that an error has occurred. I just find myself missing the webpage that I thought I had saved.