Clean up "gotchas" before exporting #export


Tom H
 

This topic covers things you might be better off cleaning up on your Wikispaces site before exporting, especially if you are more comfortable with the tools available in Wikispaces than with those of your new host. I'm discovering them with each comparative trial I run, by inspecting this page or that, examining import logs, or comparing statistics such as the number of files in my export versus the number imported. Of course, if you can run repeated imports on your new host, you may find problems in your Wikispaces site that correspond to nothing I've encountered with mine.

Broken links

To conserve storage on Wikispaces while my site was on the Free Plan (what was the limit? 100MB?), I put a lot of images on postimage.org and pasted links to them into my Wikispaces pages, e.g. <img src="https://s6.postimg.org/ozq0j20s1/Groups_Population.png" alt="As ... I've just discovered that postimage.org has revamped its site/storage servers and these links are all now broken. They will still be broken after migration. When and where should they be fixed?

Now that my Wikispaces site is on a paid plan, space is not an issue, nor does it appear to be with any of the hosts I am looking at, with the possible exception of Classic Google Sites (100 MB, although you can put files onto Google Drive). So another question is whether to fix the links or to download each file from the external host and upload it to the wiki host. If the latter, to the old host or the new one? That decision may be driven by how intensively your site is used, by reports of broken links from users, by when you plan to migrate, and by how much time it will take to fix the breaks.

In my case, migration becomes unavoidable by the end of September and I have not yet decided on the new platform and host. I'm also hoping that there will be still more improvements to the Wikispaces export, so I'm inclined to fix first. And given the volatility of image hosts, I might just as well transfer the files from them to Wikispaces so they are included in my final export package.

So the next step is to find the links to postimg.org on my Wikispaces pages. To my surprise and initial shock, the Wikispaces Search tool does not look into hyperlinks, so it found nothing. I then used Google Search with the term "postimg.org" site:sqlitetoolsforrootsmagic.wikispaces.com. That was better: 28 images were shown (will they disappear the next time the site is crawled if I don't fix the links?) but only two different pages, neither of which proved useful (one was different dates for Changes and the other had no external images; I'm baffled why it was in the search results). The 28 image results are a very effective way of getting to the pages needing a fix, as long as they don't disappear, but I think the surest way to find all pages with a known, systemically broken link such as this is to search through an export of the Wikispaces site.

I've been exploring the HTML export more extensively than the MediaWiki or Dokuwiki ones because I know HTML better and because it is the medium I've had to use for multiple trial migrations, e.g., static HTML site, WordPress hosts, EditMe. Like @lectrichead, I use Notepad++ to search the whole folder of HTML files exported by Wikispaces; its regular expression search tool is very powerful and fast. Using the regular expression search term <img src="http[^"]+postimg.org resulted in 31 hits in 15 files in the export's "mainspace" folder, which holds the exported page files. I haven't checked whether the 31 image links are unique or include some repeats, or whether Google found them all, but now I have a list of pages to work through.
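For anyone who would rather script the same folder search than use Notepad++, here is a minimal Python sketch of it. It is only an illustration: the function names find_image_links and summarize are made up for this example, the ".html" file extension is an assumption about the export, and the pattern is a slightly tightened variant of the search term above (the dot is escaped and the whole URL is captured), so counts could differ a little from the Notepad++ result.

```python
import re
from pathlib import Path

# Variant of the Notepad++ search term: the dot in "postimg.org" is escaped
# so it matches literally, and the full URL inside src="..." is captured.
PATTERN = re.compile(r'<img\s+src="(http[^"]+postimg\.org[^"]*)"', re.IGNORECASE)


def find_image_links(folder, pattern=PATTERN):
    """Return {relative file path: [matching URLs]} for .html files under folder.

    Assumes the exported pages end in ".html"; adjust the glob if yours differ.
    """
    hits = {}
    for path in sorted(Path(folder).rglob("*.html")):
        text = path.read_text(encoding="utf-8", errors="replace")
        urls = pattern.findall(text)
        if urls:
            hits[path.relative_to(folder).as_posix()] = urls
    return hits


def summarize(hits):
    """Return (total number of hits, number of files with at least one hit)."""
    total = sum(len(urls) for urls in hits.values())
    return total, len(hits)
```

Calling find_image_links("mainspace") from the folder that contains the export's "mainspace" folder would give the list of pages to work through, and summarize() on that result gives the hit and file counts to compare with what Notepad++ reports. Wrapping a page's URL list in set() shows whether any of its links are repeats.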

Something similar can be done for other types of links. And there are utilities that crawl a website to find broken links, which are worth running both before and after migration.
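As one possible way to extend the idea to other link types, the sketch below tallies every external host referenced by an href or src attribute in the exported pages, so you can see which outside services the site depends on before you migrate. It does not test whether the links still work (that is the job of the crawling utilities). The names LinkCollector and external_hosts are invented for this example, and the ".html" extension is again an assumption about the export.

```python
from collections import Counter
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the values of href and src attributes from one HTML page."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.urls.append(value)


def external_hosts(folder):
    """Count links per host across all .html files under folder.

    Relative links have no host part and are skipped, so only links that
    name a host (normally external ones) are counted.
    """
    counts = Counter()
    for path in Path(folder).rglob("*.html"):
        collector = LinkCollector()
        collector.feed(path.read_text(encoding="utf-8", errors="replace"))
        for url in collector.urls:
            host = urlparse(url).netloc.lower()
            if host:
                counts[host] += 1
    return counts
```

Printing external_hosts("mainspace").most_common() lists the hosts by number of links, which makes a volatile image host stand out quickly.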
  
Tom
Looking to move SQLite Tools For RootsMagic from Wikispaces
