Message-ID: <5301FF51.1060509@eu.citrix.com>
Date: Mon, 17 Feb 2014 12:23:45 +0000
From: George Dunlap
To: Jan Beulich, Yang Z Zhang, Tim Deegan
Cc: "andrew.cooper3@citrix.com", Xiantao Zhang, "xen-devel@lists.xen.org"
In-Reply-To: <5301F000020000780011CCE0@nat28.tlf.novell.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram

On 02/17/2014 10:18 AM, Jan Beulich wrote:
>>>> On 13.02.14 at 17:20, Tim Deegan wrote:
>> At 15:55 +0000 on 13 Feb (1392303343), Jan Beulich wrote:
>>>>>> On 13.02.14 at 16:46, George Dunlap wrote:
>>>> On 02/12/2014 12:53 AM, Zhang, Yang Z wrote:
>>>>> George Dunlap wrote on 2014-02-11:
>>>>>> I think I got a bit distracted with the "A isn't really so bad" thing.
>>>>>> Actually, if the overhead of not sharing tables isn't very high, then
>>>>>> B isn't such a bad option. In fact, B is what I expected Yang to
>>>>>> submit when he originally described the problem.
>>>>> Actually, the first solution came to my mind is B.
>>>>> Then I realized that even chose B, we still cannot track the memory
>>>>> updating from DMA(even with A/D bit, it still a problem). Also,
>>>>> considering the current usage case of log dirty in Xen(only vram
>>>>> tracking has problem), I though A is better.: Hypervisor only need to
>>>>> track the vram change. If a malicious guest try to DMA to vram range,
>>>>> it only crashed himself (This should be reasonable).
>>>>>> I was going to say, from a release perspective, B is probably the
>>>>>> safest option for now. But on the other hand, if we've been testing
>>>>>> sharing all this time, maybe switching back over to non-sharing
>>>>>> whole-hog has the higher risk?
>>>>> Another problem with B is that current VT-d large paging supporting
>>>>> relies on the sharing EPT and VT-d page table. This means if we choose
>>>>> B, then we need to re-enable VT-d large page. This would be a huge
>>>>> performance impaction for Xen 4.4 on using VT-d solution.
>>>>
>>>> OK -- if that's the case, then it definitely tips the balance back to
>>>> A. Unless Tim or Jan disagrees, can one of you two check it in?
>>>>
>>>> Don't rush your judgement; but it would be nice to have this in before
>>>> RC4, which would mean checking it in today preferrably, or early
>>>> tomorrow at the latest.
>>> That would be Tim then, as he would have to approve of it anyway.
>> Done.
> Actually I'm afraid there are two problems with this patch:
>
> For one, is enabling "global" log dirty mode still going to work
> after VRAM-only mode already got enabled? I ask because the
> paging_mode_log_dirty() check which paging_log_dirty_enable()
> does first thing suggests otherwise to me (i.e. the now
> conditional setting of all p2m entries to p2m_ram_logdirty would
> seem to never get executed). IOW I would think that we're now
> lacking a control operation allowing the transition from dirty VRAM
> tracking mode to full log dirty mode.

Hmm, yes, doing a code inspection, that would appear to be the case.
This probably wouldn't be caught by osstest, because (as I understand it)
we never attach to the display, so dirty vram tracking is probably never
enabled.

> And second, I have been fighting with finding both conditions
> and (eventually) the root cause of a severe performance
> regression (compared to 4.3.x) I'm observing on an EPT+IOMMU
> system. This became _much_ worse after adding in the patch here
> (while in fact I had hoped it might help with the originally observed
> degradation): X startup fails due to timing out, and booting the
> guest now takes about 20 minutes). I didn't find the root cause of
> this yet, but meanwhile I know that
> - the same isn't observable on SVM
> - there's no problem when forcing the domain to use shadow
>   mode
> - there's no need for any device to actually be assigned to the
>   guest
> - the regression is very likely purely graphics related (based on
>   the observation that when running something that regularly but
>   not heavily updates the screen with X up, the guest consumes a
>   full CPU's worth of processing power, yet when that updating
>   doesn't happen, CPU consumption goes down, and it goes further
>   down when shutting down X altogether - at least as log as the
>   patch here doesn't get involved).
> This I'm observing on a Westmere box (and I didn't notice it earlier
> because that's one of those where due to a chipset erratum the
> IOMMU gets turned off by default), so it's possible that this can't
> be seen on more modern hardware. I'll hopefully find time today to
> check this on the one newer (Sandy Bridge) box I have.

So you're saying that the slowdown happens if you have EPT+IOMMU, but
*not* if you have EPT alone (IOMMU disabled), or shadow + IOMMU?

I have an issue I haven't had time to look into where windows installs
are sometimes terribly slow on my Nehalem box; but it seems to be only
with qemu-xen, not qemu-traditional. I haven't tried with shadow.

 -George
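
Illustration of the first problem Jan raises in the message above (the early paging_mode_log_dirty() check): a minimal toy model in Python, not Xen code. The class ToyDomain, the function toy_log_dirty_enable() and the log_global parameter are hypothetical names chosen for this sketch; only the control flow Jan describes is modelled, under the assumption that the enable path returns early whenever log-dirty mode is already on.

```python
from dataclasses import dataclass, field

# Toy stand-ins for p2m entry types; this is a model, not the Xen p2m.
RAM_RW = "p2m_ram_rw"
RAM_LOGDIRTY = "p2m_ram_logdirty"


@dataclass
class ToyDomain:
    # What a paging_mode_log_dirty()-style check would report.
    log_dirty: bool = False
    # Eight fake guest pages, all starting as ordinary read/write RAM.
    p2m: list = field(default_factory=lambda: [RAM_RW] * 8)


def toy_log_dirty_enable(d, log_global):
    """Hypothetical model of the enable path described in the thread."""
    if d.log_dirty:
        # Early exit: the mode is already on, so nothing below runs.
        return False
    d.log_dirty = True
    if log_global:
        # Conditional step: mark every p2m entry as log-dirty.
        d.p2m = [RAM_LOGDIRTY] * len(d.p2m)
    return True


d = ToyDomain()
vram_ok = toy_log_dirty_enable(d, log_global=False)   # VRAM-only tracking first
global_ok = toy_log_dirty_enable(d, log_global=True)  # then a full log-dirty request
print(vram_ok, global_ok, d.p2m.count(RAM_LOGDIRTY))
```

In this model the second call returns before reaching the conditional step, so no entry ever becomes p2m_ram_logdirty. That is the missing VRAM-to-global transition Jan points out; whether the real code behaves identically has to be checked against the Xen source.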