More medium/long term Typo goals

Piers_C · August 1, 2006, 11:12am

Remember this:

http://www.mail-archive.com/[email protected]/msg02284.html

Well, once I’ve checked in my current local branch which implements a
feedback specific state machine, we’ll have hit most of those goals,
so it’s time to think about a few more.

Outstanding/Short Term

Finish working through the implications of the new state machine
based content ‘state’. Essentially, ‘published?’, ‘spam?’ and a few
other flags are now delegated to the content’s state. Any behaviours
on an object that do anything like:

if self.published?
…
end

should have the ‘…’ turned into a method on the state objects and
get rewritten as

self.state.meaningful_name_for_some_behaviour

Something similar applies to some of the things the controllers are
doing. Structural code (where controllers treat objects as nothing
more than data structures) is anathema; behaviour should be pushed
to the model where it makes sense. Controllers shouldn’t access the
state directly, so state dependent behaviour should be pushed to the
model first and then delegated to the state object.

Various of the delegated query methods could usefully be added to
the model’s table, to help with searching. For instance, it would be
useful for the feedback page to list only those items which are
probably spam, which means adding ‘spam’ and
‘classification_is_certain’ (lousy name, need to find something
better) booleans to the contents table. (Controllers don’t (and
shouldn’t) know about the state objects, they just know about the
query methods, so we need flag fields corresponding to the query
methods which can be used as find conditions.)

Make page caching work right. Still on my todo list, but not quite
so urgent as it was. It’s in this section because it’s one of the
things that haven’t yet been done from the last goals post.

Give the theming section a long hard look. Yay! We have Scribbish in
the core now. However, theming still needs looking at. Wouldn’t it be
great if you could just drop a theme in vendor/plugins and have it
appear in Typo’s list? Wouldn’t it be handy to support theme specific
configuration? Wouldn’t it be useful if all this was documented?

Blog settings. Hmm…

Text filters not being controllers. Scott’s doing some work on making
url_for work without having to get at the current controllers all the
time. Once that’s done it should be a great deal easier to turn
textfilters into models. Again, it would be good to be able to write
textfilters so they could be dropped into vendor/plugins.
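
A minimal sketch of the state delegation described in the first item above. The class names (ContentState, PublishedState, PresumedSpamState) and the send_notifications behaviour are hypothetical stand-ins, not Typo's actual code. The only point is that the body of an `if self.published?` block becomes a method on the state object, and that callers talk to the model rather than to the state.

```ruby
# Hypothetical state classes, for illustration only.
class ContentState
  def published?; false; end
  def spam?; false; end

  # Behaviour that used to sit inside `if self.published? ... end`.
  # The default state does nothing.
  def send_notifications(content); end
end

class PublishedState < ContentState
  def published?; true; end

  def send_notifications(content)
    content.notified = true
  end
end

class PresumedSpamState < ContentState
  def spam?; true; end
end

# Hypothetical model: query methods and behaviour are delegated to the state.
class Content
  attr_accessor :state, :notified

  def initialize(state)
    @state = state
    @notified = false
  end

  def published?; state.published?; end
  def spam?; state.spam?; end

  # Controllers call this; they never touch `state` themselves.
  def send_notifications
    state.send_notifications(self)
  end
end

article = Content.new(PublishedState.new)
article.send_notifications
puts article.notified
```

With this shape, adding a new state means adding a class, not another branch in every method that used to test a flag.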

New stuff

Authentication. I’ve posted about this before, but we should really
support, at the very least, OpenID as well as our own internal users
table.

Authorization. Once we can authenticate, we can authorize. Some
blogs might choose to require authentication before allowing anyone to
comment, or automatically mark any unauthenticated comments as
PresumedSpam.

Pluggable Spam classification. Right now, when feedback is created,
it’s in the ‘Unclassified’ state. Saving it ‘collapses’ it to one of
‘PresumedSpam’ or ‘PresumedHam’. The method for doing this
classification is currently hardwired. First we check our own Spam
Protection library, then, if it’s turned on, we ask Akismet.

Which is all very well, but what happens if we decide we want to use
Authentication as a factor in classification? What if someone writes
a captcha plugin (which gets no nearer to Typo than vendor/plugins
dammit)?

So, we need to think about making the classification system into a
dynamic pipeline attached to the blog. My current thinking is that
this would be configured like sidebars are now (though possibly not
at the model level, about which more later). The administrator would
be presented with a handy drag and drop interface and would drag
classification tools into a pipeline. So, I might have a pipeline
that looks like:

is_logged_in → article_age → blacklist → akismet

While someone else could have

captcha → article_age → blacklist → presume_ham

Each engine in the chain would look at the feedback and return one
of :ham, :spam or nil, where returning a symbol halts the
classification process; otherwise the feedback goes to the next
classifier in the chain. Something similar could be done at the
point that spam is firmly classified by the administrator: each
engine would have its #report_as_spam/ham(feedback) method
called and would do the reporting as it chose.

Multiblogging. I’m so in two minds about this. I think it’s going to
happen, and I don’t think it’s going to have an enormous impact
on the performance or complexity of the rest of Typo. Introducing
the blog object has proved to be The Right Thing, for all its
teething problems.

RESTful API. I like REST. It just makes sense to me. But Typo
isn’t all that RESTful. Article permalinks are, of course,
sacrosanct, but pretty much everything else is fair game. In the
long term, I’d like to see the back of admin/* in favour of moving
administrative behaviour up into top level controllers and slightly
more complicated access control. This is definitely branch
territory, if only because the support for the sort of thing I’m
thinking of has only recently gone into edge rails.

Migrations. Migrating is hard. It’s hard to keep the migration
scripts up to date. So, I propose firming up the various message
posting protocols we have, documenting them, and then sticking to
them both in our controllers and any migrations we use. I envisage
using something like the Atom API as our basic posting protocol.

Implement the Atom API.

Use the rails plugins directory. If you write a textfilter, or a
sidebar, you want some easy way of distributing it. And we want not
to have to stick it in the core distribution. It seems that the best
way to do this is to enable writing plugins that can be installed in
the same way as any other rails plugin. I think that this can
already be done, but it’s completely undocumented. We need to
investigate this and, if it will work with the current state of the
art, we need to write generate tasks to allow a plugin developer to
do:

./script/generate sidebar|textfilter|theme

and have a framework dropped in place so she can get on with the
hard work of making it do something interesting.

Investigate other blogging engines’ plugin architectures. See if
we’re missing any capabilities and what we’d need to do to import
any useful stuff into Typo.
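
The classification pipeline proposed above could be sketched roughly as follows. Everything named here is an assumption for illustration: the Feedback struct, the engine classes and the classify method are hypothetical. Only the contract comes from the description: each engine returns :ham, :spam or nil, and the first non-nil verdict halts the chain.

```ruby
# Hypothetical feedback record, just enough fields for the example.
Feedback = Struct.new(:author_logged_in, :body, :article_age_days)

class LoggedInClassifier
  def classify(feedback)
    feedback.author_logged_in ? :ham : nil
  end
end

class ArticleAgeClassifier
  def initialize(max_days)
    @max_days = max_days
  end

  # Feedback on very old articles is presumed spam.
  def classify(feedback)
    feedback.article_age_days > @max_days ? :spam : nil
  end
end

class BlacklistClassifier
  def initialize(words)
    @words = words
  end

  def classify(feedback)
    @words.any? { |word| feedback.body.include?(word) } ? :spam : nil
  end
end

class PresumeHam
  def classify(_feedback)
    :ham
  end
end

class ClassificationPipeline
  def initialize(*engines)
    @engines = engines
  end

  # Ask each engine in order; the first :ham or :spam wins.
  # Returns nil if no engine has an opinion.
  def classify(feedback)
    @engines.each do |engine|
      verdict = engine.classify(feedback)
      return verdict if verdict
    end
    nil
  end

  # Let every engine that cares report confirmed spam in its own way.
  def report_as_spam(feedback)
    @engines.each do |engine|
      engine.report_as_spam(feedback) if engine.respond_to?(:report_as_spam)
    end
  end
end

pipeline = ClassificationPipeline.new(
  LoggedInClassifier.new,
  ArticleAgeClassifier.new(30),
  BlacklistClassifier.new(["casino"]),
  PresumeHam.new
)
puts pipeline.classify(Feedback.new(false, "nice post", 3))
```

Because the engines share a single small interface, reordering the pipeline or dropping in a captcha engine from vendor/plugins would not require touching the pipeline class itself.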

Hmm… that’ll probably do for now. Did I miss anything?

Piers_C · August 1, 2006, 2:21pm

Piers_C · August 1, 2006, 3:43pm

Alastair R. [email protected] writes:

marked as presumed spam, merely unpublished. It took me a while to
realise that real comments were being published and that spam was
being held.

Except that’s changed with the last checkin. Now the presumed stuff is
marked as Spam? and Ham? and the confirmed stuff is marked as Spam and
Ham.

Also it seems that you (blog admin) need to go and remove unpublished
(ie spam) comments periodically. They seem to appear on the articles
otherwise. Or am I doing something wrong?

They shouldn’t be appearing on the articles (assuming they were posted
after the spam checking code went in). And one thing I mean to add is
an option to say that marking something as definite spam should just
report that to Akismet and then delete the feedback.

Some other suggestions:

Ability to configure multi-column sidebars

Wah! I knew I’d missed something.

A wordpress-like “dashboard” of recent comments, incoming links,
stats (number of articles, comments, etc)

Hmm… we’re not doing much in the way of intelligent stats gathering
yet, so I’m not sure how easy ‘incoming links’ would be to do, but I
like the idea of a dashboard.

Ability to specify a license on a per-page or per-article basis
(with a blog-wide default obviously) which would generate the right
HTML and RDF for easy inclusion in themes. I’m thinking specifically
of making it easy to add a creative commons license to a blog.

The problem with per-article licensing is how you sort out the
licensing of the front page. Probably doable with a sidebar that works
in a similar fashion to the current amazon sidebar (walks the list of
contents being rendered on the current page extracting book
citations/links and then builds the sidebar). But I’m still not sure
how it would choose the frontpage license. Ho hum. Worth thinking
about though.

Here’s one I forgot:

Investigate splitting the contents table into ‘contents’ (for pages
and articles) and ‘feedback’ for comments and incoming
trackbacks. The tricky part would be doing the split, and deciding
whether it’s a requirement to maintain article ids (if you do decide
to do that, you enter a twisty little maze of RDBMSes, all
different, as you try to make sure that new ids get generated
correctly once you’ve done the conversion) or to break any
/articles/read/n type links. I’m beginning to think we need to do
this: articles and feedback have a fair amount in common, but the
differences are becoming more distinct with each migration.

Piers_C · August 3, 2006, 1:12pm

Just a heads up: I do intend to get back into working on new
Typo development (I know I’ve been doing barely anything lately), but
for the next 3 weeks or so I’m going to be away (WWDC, vacation, etc). I
may be on #typo a little, but a lot of the time I won’t be around. I’ll
still respond to email, though.

–
Kevin B.
http://kevin.sb.org
[email protected]

Piers_C · August 4, 2006, 12:22am

Kevin B. [email protected] writes:

Just a heads up: I do intend to get back into working on new
Typo development (I know I’ve been doing barely anything lately), but
for the next 3 weeks or so I’m going to be away (WWDC, vacation, etc). I
may be on #typo a little, but a lot of the time I won’t be around. I’ll
still respond to email, though.

Continuing that theme, I’m off to Sidmouth Folk Festival 'til Tuesday,
when I’ll stick my nose in at the London Ruby U. Group
meeting. Expect me back on the Typo horse on Thursday or Friday next
week when I may be starting what I’m thinking of as a ‘new sidebars’
branch with a todo list that goes something like:

1. Get rid of all the current sidebars and any infrastructure that
   supports them; simply support the current helper method for
   displaying them (I don’t want to break my themes, but the helper
   will probably just respond with ‘this space intentionally left
   blank’ for the time being).
2. Experiment with various ways of providing sidebars that don’t use
   components to do their magic. Which thought is, even now, hurting
   my head.
3. Work out which one is least horrible and most testable.
4. Commit it to the trunk, port our currently supported sidebars to
   the new scheme and make everyone who has their own sidebars to
   support cry.
5. Wait 'til everyone has stopped crying and ported their sidebars.
6. Write a handy adapter that makes old style sidebars work with the
   new scheme.

Please note that steps 2 through 6 are highly speculative.

Piers_C · August 3, 2006, 12:55pm

I mostly agree with Piers. Here’s a short version of my list (short,
because I’m on vacation this week):

Finish the url_for cleanup. All Content model objects should
support a permalink_url (or similar) method that returns a link to the
permalink for the object. Maybe an edit_url and delete_url,
too; that’d clean up some of the views.

Helper cleanup. There are zillions of duplicate ways to get URLs and
so forth buried in our helpers. Clean them up, standardize, optimize,
and document.

Add rdoc where appropriate.

Query optimization. The front page (sans sidebars) should take 3
queries: blog_id, article count, page 1 article bodies. We’re down to
~30 now, from 300 late in the 4.0 dev cycle. Sidebars should be
cleaned up where possible, so they’re fast.

Optimization. Faster, faster, faster. With less RAM.

Remove components. They were a nice idea. At least according to
the Rails docs. However, it’s unclear that anyone else uses them for
anything. We’ve certainly found substantial bugs in them that no one
else caught. Plus, they’re slow. Make it possible to bundle
sidebars, themes, and text filters into Rails plugins, and build some
sort of infrastructure to make this easy for users. Then move most of
the non-core sidebars out of the main tree.

Clean up the admin UI. Everyone has ideas on this. I’d love to see
a few mockups.

Threaded comments. I miss mine. I have a model that is fairly
simple; I’ll show it to people later.

OpenID consumer and producer. Typo should be able to use OpenID to
authenticate comment posters and act as an OpenID producer so people
can use their blog as their identity.

Lightweight permissions model. I’d like to be able to keep
commenter preferences in the user table, so we can send out email
notifications when people follow up on their comments. I DON’T want
a 50-table complex permissions model.

Multiblog support. Ignoring the permissions problem, we’re 95% of
the way there already.

Better Jabber/XMPP support. I’d love to be able to send Atom over
XMPP. There’s a spec out for it, although it has issues. I’d really
love to merge OpenID, some sort of Typo registry website, and XMPP
to make cross-Typo comments and trackbacks much more powerful than
they are now.

Atom Publishing Protocol. I love the Atom publishing protocol. Go
play with the Google C. API for an example of what APP is able
to do. There’s a bit of a problem with it and Typo, though: our text
filter model doesn’t really fit into APP’s view of the world. I
talked with Tim B. about it a bit at OSCON, and I think I see a way
out.

Standardized import and export scripts. I’d love to be able to
say ‘typo import /some/path wordpress /tmp/foo’. Similarly, we could
standardize some sort of Atom-based Typo export format.
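
As a rough illustration of the first item in the list above, here is what a model that knows its own URLs might look like. This is only a sketch under stated assumptions: the Article class is a hypothetical stand-in for a Content model, and both the /articles/YYYY/MM/DD/slug layout and the admin paths are assumed for the example rather than taken from Typo's routes.

```ruby
require "date"

# Hypothetical stand-in for a Content model; not Typo's real class.
class Article
  attr_reader :id, :permalink, :published_at

  def initialize(id:, permalink:, published_at:, base_url:)
    @id = id
    @permalink = permalink
    @published_at = published_at
    @base_url = base_url
  end

  # Assumed layout: <base>/articles/YYYY/MM/DD/<slug>
  def permalink_url
    "#{@base_url}/articles/#{published_at.strftime('%Y/%m/%d')}/#{permalink}"
  end

  # Assumed admin paths, purely illustrative.
  def edit_url
    "#{@base_url}/admin/content/edit/#{id}"
  end

  def delete_url
    "#{@base_url}/admin/content/destroy/#{id}"
  end
end

article = Article.new(
  id: 7,
  permalink: "typo-goals",
  published_at: Date.new(2006, 8, 1),
  base_url: "http://blog.example.com"
)
puts article.permalink_url
```

Views and helpers would then ask the object for its URL instead of each rebuilding it from controller state.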

In all honesty, I don’t expect to get most of this done for 4.1. I’d
be happy if we could do a bit of permissions work, fix the URLs, ditch
components, and speed things up. I’d really like to have 4.1 out
during 2006.

Scott

Piers_C · August 6, 2006, 8:28am

Actually, we have atom exporting already (I assume by exporting you mean
a feed). The atom support we need is the atom blog API. I have a branch
sitting around (on the computer in a box winging its way back to MA that
I won’t see for two weeks) that was intended to work on that, but I
didn’t get much work done back then. If nobody’s taken it up in the next
few weeks I may re-start my attempt.

On Fri, Aug 04, 2006 at 08:00:23PM +1000, Alastair R. wrote:

much like rocket science.

I’m intending to just wade in, hacking and slashing, but any
suggestions appreciated.

–
Kevin B.
http://kevin.sb.org
[email protected]
http://www.tildesoft.com

Piers_C · August 4, 2006, 12:05pm

On 03/08/2006, at 2:55 AM, Scott L. wrote:

I mostly agree with Piers. Here’s a short version of my list (short,
because I’m on vacation this week):

[… snip a fantastic list of enhancements …]

Is there any low-hanging fruit for rails- and typo- newbies to
attempt? I’m willing to have a look at Atom-based exporting (seeing
as I raised it on your blog :), on the assumption that it is not too
much like rocket science.

I’m intending to just wade in, hacking and slashing, but any
suggestions appreciated.

Piers_C · August 7, 2006, 5:36pm

I’d love to see an Atom-based blog export/import standard. Does
anyone know any of the WP guys?

It seems to me that the rational thing to do would be to have a “pure”
Atom feed, including HTML for each entry, and then add an ‘export’
namespace and use tags like ‘export:description’ to include the
original markup, as well as other fields that don’t map into standard
Atom (comments open, trackbacks, etc). I’m not sure if static
content, comments, and trackbacks would be better as more entries with
a special type flag or as their own tag.
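
To make the idea concrete, here is a hedged sketch of one exported entry built with REXML from Ruby's standard distribution. The Atom namespace URI is the real Atom 1.0 one; the export namespace URI, the export:description and export:comments_open element names, and the markup attribute are all hypothetical, since no such standard exists yet.

```ruby
require "rexml/document"

ATOM_NS = "http://www.w3.org/2005/Atom"
# Hypothetical namespace URI for the proposed export extension.
EXPORT_NS = "http://example.org/blog-export"

# Build a single Atom entry carrying both the cooked HTML and the raw markup.
def export_entry(title:, html:, raw:, comments_open:)
  entry = REXML::Element.new("entry")
  entry.add_namespace(ATOM_NS)             # default namespace (xmlns=...)
  entry.add_namespace("export", EXPORT_NS) # xmlns:export=...

  entry.add_element("title").text = title

  # Standard Atom: the rendered HTML, escaped as text.
  entry.add_element("content", { "type" => "html" }).text = html

  # Extension elements: original markup and fields Atom has no slot for.
  entry.add_element("export:description", { "markup" => "markdown" }).text = raw
  entry.add_element("export:comments_open").text = comments_open.to_s

  entry
end

entry = export_entry(
  title: "Hello",
  html: "<p><em>hi</em></p>",
  raw: "*hi*",
  comments_open: true
)
puts entry.to_s
```

A consumer that only understands plain Atom can ignore the export elements and still get a usable feed, which is the attraction of layering the extension on top rather than replacing the content element.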

I suspect that we could get a few prominent non-Typo XML people
involved in this pretty quickly, if we can sit down and produce a
first draft of the standard.

Scott

Piers_C · August 7, 2006, 2:00pm

On 06/08/2006, at 4:24 PM, Kevin B. wrote:

Actually, we have atom exporting already (I assume by exporting you
mean a feed).

Actually, I meant more than just a feed.

I don’t know about anyone else, but I find it a lot easier to put my
data somewhere that I know I will be able to get it out again. This
applies especially to web applications hosted by third parties,
where export facilities are a must before I’ll even look at it, but I
believe it is generally a good practice. The ability to export is
something that is missing from Typo. You can get your data out again,
but it’s in the database schema du jour.

Hence the motivation for an export facility for Typo. This is an
escape hatch, if you will, for migration to an unspecified future
blogging platform.

At present there is no common interchange format for blog content.
The closest we get is Atom, and I’m not even sure whether that is
entirely suitable for representing all of the interesting content of
a blog. Anyway I think it’s worth having a crack at it.

Unlike the current Atom feed, the Atom export facility would:
a) include a complete archive, that is all of the articles, pages,
comments, trackbacks, etc in the blog
b) use the ‘raw’ formatting (eg markdown) instead of HTML
c) probably ignore attachments (unless someone has a better idea?)

On Scott’s blog I mentioned the possibility that this could be
expanded to fill the role of a backup/restore format, and he was
fairly sure he wanted these to be completely separate. Which seems
reasonable.

Piers_C · August 8, 2006, 1:41pm

On 08/08/2006, at 1:36 AM, Scott L. wrote:

I’d love to see an Atom-based blog export/import standard. Does
anyone know any of the WP guys?

Sorry I don’t. And not to put a downer on this idea but one of the
reasons I migrated to Typo from WP was that WP had been sitting still
with its Atom support (it is still on 0.3 AFAIK). But yes we should
get the other blogging engines involved.

It seems to me that the rational thing to do would be to have a “pure”
Atom feed, including HTML for each entry, and then add an ‘export’
namespace and use tags like ‘export:description’ to include the
original markup, as well as other fields that don’t map into standard
Atom (comments open, trackbacks, etc).

Yep, except probably “typo:” might be a more appropriate namespace
prefix (yes I know these aren’t significant :).

Also, the original source of each entry could be used instead
of the resulting HTML and it would still be valid Atom. It might be
simpler to let the user select whether they want the raw or cooked
(ie HTML) content included in their export, rather than have
the importer work out which of two alternative representations is
more compatible.

I’m not sure if static
content, comments, and trackbacks would be better as more entries with
a special type flag or as their own tag.

Yes, another tricky one. Comments and trackbacks would also need to
indicate which article (or page?) they are attached to.

I suspect that we could get a few prominent non-Typo XML people
involved in this pretty quickly, if we can sit down and produce a
first draft of the standard.

Agree.

I can volunteer to make a start on this.

Piers_C · August 8, 2006, 2:47pm

On 08/08/2006, at 9:38 PM, I wrote:

Yep, except probably “typo:” might be a more appropriate namespace
prefix

… but not if we want this to be used by other blogging engines,
which is of course the point of the exercise.

Sorry I don’t know where my head was when I suggested that.

I agree to go with “export:” as a namespace prefix for now.

Piers_C · August 8, 2006, 3:50pm

:-).

I know a few of the 6A folks, but I’m not sure that they’re the right
place to start with this. There has to be a blog-implementers list
out there somewhere.

Scott

Piers_C · August 8, 2006, 5:33pm

On 8/8/06, Dominic M. [email protected] wrote:

I agree to go with “export:” as a namespace prefix for now.

No. You go with “FGYRKJY” as the prefix (because it doesn’t matter) and
“http://some.where.central/export” as the namespace URI.

:-).

Scott

Piers_C · August 8, 2006, 10:39pm

I really should re-read it, but I’m pretty sure the Atom blog API
stuff deals with getting the raw data out from the blog so you can
edit it. And you can request more than just the last 10 or 15 posts or
so - you can request chunks going back until the beginning.

So I would hold off discussion on this until we can verify whether the
Atom blog API does in fact do everything you want.

Piers_C · August 8, 2006, 5:30pm

On Tue, Aug 08, 2006 at 10:44:53PM +1000, Alastair R. wrote:

I agree to go with “export:” as a namespace prefix for now.

No. You go with “FGYRKJY” as the prefix (because it doesn’t matter) and
“http://some.where.central/export” as the namespace URI.

-Dom
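
The point about prefixes can be checked with a namespace-aware parser. The sketch below uses REXML and the placeholder URI from the message above; the description element is a hypothetical example. Both documents bind a different prefix to the same URI, and the lookup goes by URI, so it finds the element either way.

```ruby
require "rexml/document"

# Placeholder namespace URI quoted in the message; not a real schema.
EXPORT_URI = "http://some.where.central/export"

# Find the description element by namespace URI. The "x" prefix here is
# local to the query and has nothing to do with the document's own prefix.
def raw_description(xml)
  doc = REXML::Document.new(xml)
  node = REXML::XPath.first(doc, "//x:description", { "x" => EXPORT_URI })
  node && node.text
end

with_export_prefix = <<~XML
  <entry xmlns="http://www.w3.org/2005/Atom" xmlns:export="#{EXPORT_URI}">
    <export:description>Some *raw* markup</export:description>
  </entry>
XML

with_odd_prefix = <<~XML
  <entry xmlns="http://www.w3.org/2005/Atom" xmlns:FGYRKJY="#{EXPORT_URI}">
    <FGYRKJY:description>Some *raw* markup</FGYRKJY:description>
  </entry>
XML

puts raw_description(with_export_prefix) == raw_description(with_odd_prefix)
```

So the thing worth agreeing on is the URI; each exporter can pick whatever prefix it likes.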

Piers_C · August 9, 2006, 3:28pm

I just read the APP spec and I couldn’t see any discussion about raw
vs cooked (as I like to call it) content.

I think it is unlikely that they (APP spec writers) will add
additional fields to the Atom entry document as alternative
representations of the content, but hey that’s just my guess.

However it is pretty clear that APP builds upon Atom, and the
additional functionality provided is not particularly relevant to the
task of exporting blog content. For example the entries in a
“collection” are retrieved as an Atom feed document, with only minor
extensions: the paging extensions, which are largely irrelevant for
our purposes, and the “publishing control” extensions, which look
useful (“draft” status for entries).

The point being that we probably don’t need to wait for APP to be
completed before starting work on blog exporting using Atom. Which is
good, because I have almost finished a first draft proposal on this
subject!

Piers_C · August 8, 2006, 11:06pm

I don’t remember seeing it when I read the spec last year, and Tim
Bray thought that filtering and formatting was a known problem when I
asked him a couple weeks ago. It doesn’t really make a difference in
this case, though: we’re talking about serializing the whole blog to an
Atom-based XML file, which is kind of orthogonal to the Atom
Publishing Protocol. I don’t believe there’s any standardized way to
handle comments and trackbacks via APP, so building a generic blog
sync tool via APP isn’t really possible, either.

Scott

Piers_C · August 9, 2006, 5:49pm

The nice thing about XML specs like Atom is that you can trivially
extend them with namespaces. The down side is that XML is a pain, and
namespaces are doubly painful. At least they’re easy to parse.

Scott

Piers_C · August 15, 2006, 2:18pm

As a follow up to this goal, I’d like to see better error handling in
the sidebars. I’ll help too – I just wanted to talk over the issue
before writing a bunch of code in seclusion.

I was recently contacted by someone who was having trouble configuring
their sidebars, and most of the problems came down to bad input values.
He was frustrated because the values were accepted, but nothing was
displayed in his blog, and no warnings appeared in either the admin
console or the server logs.

After looking at some of the RSS based sidebar plugins (Flickr,
delicious, etc), I found a lot of code that looks like it was
copy-and-pasted. I’ve started figuring out how to refactor the common
RSS code into one place. Sometime in the next few days I will have
collected enough coding time to have a patch available for review.

Piers, let me know how I can help best without stepping all over your
toes.

Thanks,

Tim

Piers C. wrote:

Continuing that theme, I’m off to Sidmouth Folk Festival 'til Tuesday,

Wait 'til everyone has stopped crying and ported their sidebars

Write a handy adapter that makes old style sidebars work with the
new scheme.

Please note that steps 2 through 6 are highly speculative.

–
Timothy F.
http://digital-achievement.com