lord63's blog

Yet another zhihudaily web

2015-04-24

WHAT

Yes, I wrote yet another web version of Zhihu Daily =。=

web: http://zhihudaily.lord63.com/
github: https://github.com/lord63/zhihudaily

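I think of my web version of Zhihu Daily as a re-presentation of information. The text UI lists the day's story titles, and each title links to the official Zhihu Daily article. The image UI builds on the text UI by pairing each article with a picture. The paged UI builds on the image UI by showing several days of stories together. The three-column UI takes a higher-level view, for reading and searching across a much wider range.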

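Try each UI and pick the one you like. Personally, I actually read it on my Kindle (lol). If you read Zhihu Daily regularly, the text UI is enough; if you only browse now and then, the three-column UI is probably more convenient. I don't use the image UI or the paged UI much myself. As for why there are so many UIs: I just wanted to try out different ways of reading =v= The irony is that I probably spent 80% of my time on the front end (I'm truly hopeless at front-end work).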

WHY

According to my notebook, I first had the idea of writing my own on 2014.06.04, after seeing that someone had written GO-ZhihuDaily in Go. I actually started working on it on 2015.02.09.

HOW

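Thanks to the Zhihu Daily API, development was fairly quick and easy. I know of two versions, http://news.at.zhihu.com/api/4 and http://news.at.zhihu.com/api/1.2. I use the second one; the first is probably better suited to mobile clients, and you can give it a try.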

The rest of this post uses the second API:

http://news.at.zhihu.com/api/1.2/news/latest returns the current day's news as JSON
http://news.at.zhihu.com/api/1.2/news/before/DATE returns earlier news as JSON
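
The two endpoints above can be wrapped in a couple of small helpers. This is only a sketch: `news_url` and `fetch_news` are names made up for illustration, the User-Agent string is a placeholder, and I'm assuming DATE takes the same YYYYMMDD form as the `date` field in the response. It sticks to the standard library so you can try it without installing anything.

```python
import json
from urllib.request import Request, urlopen

API_ROOT = 'http://news.at.zhihu.com/api/1.2/news'
# Placeholder value; any browser-like User-Agent should do.
USER_AGENT = ('Mozilla/5.0 (X11; Linux x86_64; rv:28.0) '
              'Gecko/20100101 Firefox/28.0')


def news_url(date=None):
    """Build the endpoint URL: latest news, or news before `date`.

    `date` is assumed to be a YYYYMMDD string such as '20150210'.
    """
    if date is None:
        return API_ROOT + '/latest'
    return '{0}/before/{1}'.format(API_ROOT, date)


def fetch_news(date=None):
    """Request one endpoint and decode its JSON body (needs network access)."""
    request = Request(news_url(date), headers={'User-Agent': USER_AGENT})
    with urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode('utf-8'))
```

Calling `fetch_news()` should give you a dictionary shaped like the JSON described next, and `fetch_news('20150210')` requests the `before/20150210` endpoint.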

JSON format:

    {
        "date": "20150210",
        "news": [
            {
                "title": "去过还想去的奥林匹克，酷热又荒凉的死亡谷（多图）",
                "url": "http://news-at.zhihu.com/api/1.2/news/4517537",
                "image": "http://pic4.zhimg.com/87324b315fc1674451b56011ed5ead9d.jpg",
                "share_url": "http://daily.zhihu.com/story/4517537",
                "thumbnail": "http://pic2.zhimg.com/d4691da85d8f7c4a845bcac5ec49f479.jpg",
                "ga_prefix": "021019",
                "id": 4517537
            },
            {
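                 # title: the headline
                 # url: returns JSON; the article body is HTML, probably meant for mobile clients
                 # image: large 640*640 image
                 # share_url: use this one for web pages; it goes straight to the official Zhihu Daily article
                 # thumbnail: small 150*150 image
                 # ga_prefix: used for Google Analytics
                 # id: article id, the number at the end of url and share_url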
            }
            ...
        ],
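        # Whether this is today's news. Only today's response has top_stories; a
        # before/<date> response has neither 'is_today' nor 'top_stories'.
        "is_today": true,
        # The stories in top_stories also appear in news; mobile clients show
        # them in the scrolling ViewPager at the top of the screen.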
        "top_stories": [
            {
                "image_source": "Yestone.com 版权图片库",
                "title": "一群大佬都让我们警惕人工智能，这不是危言耸听",
                "url": "http://news-at.zhihu.com/api/1.2/news/4517980",
                "image": "http://pic2.zhimg.com/6d6f508e1bb6ff1f2e39c40423feb4f4.jpg",
                "share_url": "http://daily.zhihu.com/story/4517980",
                "ga_prefix": "021007",
                "id": 4517980
            },
            {
                ...
            }
        ],
        "display_date": "2015.2.10 星期二"
    }
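
To make the structure above concrete, here is a small sketch that pulls the useful bits out of one response. The function names `summarize` and `top_story_ids` are made up for illustration, and the sample dictionary is trimmed from the example above rather than fetched live. Because `top_stories` is missing from `before/<date>` responses, the code reads it with a default.

```python
def summarize(payload):
    """Return (date, [(title, share_url), ...]) for one API response."""
    stories = [(item['title'], item['share_url'])
               for item in payload.get('news', [])]
    return payload['date'], stories


def top_story_ids(payload):
    """Return the ids in top_stories, or [] when the key is absent."""
    return [story['id'] for story in payload.get('top_stories', [])]


# Trimmed copy of the example response shown above.
sample = {
    'date': '20150210',
    'news': [
        {'title': 'Example title',
         'share_url': 'http://daily.zhihu.com/story/4517537',
         'id': 4517537},
    ],
    'is_today': True,
    'top_stories': [
        {'share_url': 'http://daily.zhihu.com/story/4517980',
         'id': 4517980},
    ],
}

date, stories = summarize(sample)
for title, link in stories:
    print(date, title, link)
```

The text UI needs little more than the list returned by `summarize`: a title and a link back to the official article.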

As you can see, the API gives us the date and every article along with its metadata. Here is a minimal demo to give you a feel for it.

Directory layout:

zhihudaily/
    zhihudaily.py
    templates/
        index.html

zhihudaily.py

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-

    from flask import Flask, render_template
    import requests

    app = Flask(__name__)


    @app.route('/')
    def index():
        # Send a browser-like User-Agent; the API may refuse requests without one.
        headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; '
                                 'rv:28.0) Gecko/20100101 Firefox/28.0'}
        r = requests.get('http://news.at.zhihu.com/api/1.2/news/latest',
                         headers=headers)
        data = r.json()  # decode the JSON body once and reuse it
        return render_template('index.html',
                               display_date=data['display_date'],
                               news_list=data['news'])


    if __name__ == '__main__':
        app.run()

index.html

<!DOCTYPE html>
<html lang="zh-CN">
  <head>
    <title>Zhihu Daily</title>
  </head>
  <body>
    <p>{{ display_date }}</p>
    <ul>
      {% for news in news_list %}
      <li><a href="{{ news['share_url'] }}">{{ news['title'] }}</a></li>
      {% endfor %}
    </ul>
  </body>
</html>

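Then run $ python zhihudaily.py and you're done; the result looks like the screenshot below. Yes, with just these few lines of code, a minimal Zhihu Daily is born in your hands! From here you can keep going: add navigation to the previous and next day, or any other feature you want.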

[Screenshot: demo]
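
For the previous-day and next-day navigation mentioned above, the main thing you need is the neighbouring dates in the API's YYYYMMDD form. Here is a minimal sketch; `adjacent_dates` is a made-up helper name, and you should check for yourself which day `before/DATE` actually returns before wiring the two together.

```python
from datetime import datetime, timedelta


def adjacent_dates(date, fmt='%Y%m%d'):
    """Return (previous_day, next_day) for a date string like '20150210'."""
    day = datetime.strptime(date, fmt)
    one_day = timedelta(days=1)
    return (day - one_day).strftime(fmt), (day + one_day).strftime(fmt)


previous_day, next_day = adjacent_dates('20150210')
print(previous_day, next_day)
```

Going through `datetime` instead of doing arithmetic on the string keeps month and year boundaries correct.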

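I won't go into the rest of the implementation. Once you have the basics, you can take it in your own direction; the code is no longer the point. If you're interested, read the source on GitHub.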

Attention

Send a User-Agent header when calling the API; otherwise you may get this message:

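'Please stop crawling this site, thanks.'

Linking directly to Zhihu Daily's images returns 403 Forbidden some of the time. My solution is to use redis as an image cache: the images on my web version point to my own server, which requests them from the official site and caches them in redis. If you don't build anything that shows images, like the image UI and the paged UI, you can ignore this entirely, because our links point to Zhihu Daily, and Zhihu Daily has no trouble loading its own images. If you have a better approach, I'd love to hear it QAQ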
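
Here is a minimal sketch of that caching idea, not the site's actual code. `cached_image` and `DictCache` are names made up for illustration, and the one-day expiry is an arbitrary assumption. The `cache` argument only needs `get(key)` and `set(key, value, ex=seconds)`, which a redis-py client provides; `DictCache` is an in-memory stand-in so the sketch runs without a redis server.

```python
class DictCache(object):
    """In-memory stand-in for a redis client (expiry is ignored)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value, ex=None):
        self._data[key] = value


def cached_image(url, cache, fetch, ttl=24 * 60 * 60):
    """Return the image bytes for `url`, calling `fetch` only on a cache miss.

    `fetch` is any callable that downloads `url` and returns bytes.
    """
    data = cache.get(url)
    if data is None:
        data = fetch(url)
        cache.set(url, data, ex=ttl)  # keep it for `ttl` seconds
    return data


# Example with a fake downloader, so nothing touches the network.
cache = DictCache()
image = cached_image('http://example.com/a.jpg', cache, lambda url: b'bytes')
print(len(image))
```

In a Flask app, a route on your own server would call something like `cached_image` and return the bytes with an image content type, so the page never hotlinks the original image.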

Finally

Finally, here are my development log and the materials I collected at the time. You can try writing one yourself :)